(57) Abstract

The present invention relates to a method for preparing positive polypeptide variants by shuffling different nucleotide sequences
of homologous DNA sequences by in vivo recombination comprising the steps of a) forming at least one circular plasmid comprising a
DNA sequence encoding a polypeptide, b) opening said circular plasmid(s) within the DNA sequence(s) encoding the polypeptide(s), c)
preparing at least one DNA fragment comprising a DNA sequence homologous to at least a part of the polypeptide coding region on at least
one of the circular plasmid(s), d) introducing at least one of said opened plasmid(s), together with at least one of said homologous DNA
fragment(s) covering full-length DNA sequences encoding said polypeptide(s) or parts thereof, into a recombination host cell, e) cultivating
said recombination host cell, and f) screening for positive polypeptide variants.
Despite the existence of the above methods there is still a need for
even better iterative in vivo recombination methods for preparing
novel positive polypeptide variants.



SUMMARY OF THE INVENTION 

The object of the present invention is to provide an improved method
for preparing positive polypeptide variants by an in vivo
recombination method.

The inventor of the present invention has surprisingly found that
such positive polypeptide variants may advantageously be prepared by
shuffling different nucleotide sequences of homologous DNA sequences
by in vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows the yeast expression plasmid pJS026 comprising DNA
sequence encoding the Humicola lanuginosa lipase gene.

Figure 2 shows the yeast expression plasmid pJS037, comprising DNA
sequence encoding the Humicola lanuginosa lipase gene containing
twelve additional restriction sites.

Figure 3 shows the plasmid pJS026.

Figure 4 shows the plasmid pJS037.

Figure 5 shows the in vivo recombination of the 0.9 kb synthetic 
wild-type Humicola lanuginosa lipase with pJS037 using Saccharomyces 



WO 97/07205 PCT/DK96/00343 



cerevisiae as the recombination host cell (described in Example 1).

Figure 6 shows the in vivo recombination of a DNA fragment prepared
from Humicola lanuginosa lipase variant (y) with Humicola lanuginosa
lipase variant (d) comprised in a plasmid using Saccharomyces
cerevisiae as the recombination host cell (described in Example 2).

Figure 7 shows an overview of the location of the inactivation
site of the Humicola lanuginosa lipase gene and the number of the
clone (referred to as "blue number" in the tables). Location of
restriction enzyme sites and clone numbers are relative to the
initiation codon of the lipase gene. In all cases a stop codon was
located in the new reading frame 10 to 50 bp from the frameshift.

Figure 8 shows an overview of the creation of active Humicola
lanuginosa lipase genes from the recombinations in table 2A and B
by a "mosaic mechanism". Lines indicate the introduction of the
fragment sequence into the vector and lines with an x indicate
sequences that are not introduced in the active lipase colonies.
The primers used for the PCR fragment are shown together with the
location of the frameshift mutation (marked by the restriction site
used for the construction).

Figure 9 shows an overview of fragments used in the recombination
of 2 partial overlapping fragments into a gapped vector. The
primers used for the PCR fragments are shown together with the
location of the frameshift mutation (if not wild type).

Figure 10 shows an overview of fragments used in the recombination
of 3 partial overlapping fragments into a gapped vector. The
primers used for the PCR fragments are shown. The overlap between
PCR353 and 355 is only 10 bp.

DETAILED DESCRIPTION OF THE INVENTION 

The object of the present invention is to provide an improved method 
for preparing positive polypeptide variants by an iterative in vivo 
recombination method. 

The inventor of the present invention has surprisingly found an
efficient method for shuffling homologous DNA sequences in an in
vivo recombination system using a eukaryotic cell as a recombination
host cell.
A "recombination host cell" is in the context of the present
invention a cell capable of mediating shuffling of a number of
homologous DNA sequences.

The term "shuffling" means recombination of nucleotide sequence(s)
between two or more homologous DNA sequences resulting in output DNA
sequences (i.e. DNA sequences having been subjected to a shuffling
cycle) having a number of nucleotides exchanged, in comparison to
the input DNA sequences (i.e. starting point homologous DNA
sequences).

[...]

[...] advantage of the present [...] using a mixture of [...]
vectors the screening [...] (as can be seen in a couple of
examples below).

The in vivo recombination method of the invention is simple to [...]
and results in a high [...] of homologous genes or [...]. A large
number of [...] homologous genes can be [...]. The mixing of improved
variants or wild type genes followed by screening increases the
number of further [...] variants [...].

Recombination of multiple overlapping fragments is possible with a
[...] of variants [...] in vivo [...]. An overlap as small as
10 bp is sufficient for recombination which may be utilized for very
easy domain shuffling of even distantly related genes.

The invention relates to a method for preparing polypeptide variants
by shuffling different nucleotide sequences of homologous DNA
sequences by in vivo recombination comprising the steps of
a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one of said opened plasmid(s), together with
at least one of said homologous DNA fragment(s) covering full-length
DNA sequences encoding said polypeptide(s) or parts thereof, into a
recombination host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

According to the invention more than one cycle of step a) to f) may
be performed.

The opening of the plasmid(s) in step b) can be directed toward any
site within the polypeptide coding region of the plasmid. The
plasmid(s) may be opened by any suitable methods known in the art.
The opened ends of the plasmid may be filled-in with nucleotides as
described in Pompon et al. (1989), supra. It is preferred not to
fill in the opened ends as it might create a frameshift.

It is preferred to open the plasmid(s) around the middle of the
polypeptide coding DNA sequence(s), as this is believed to result in
a more effective recombination between DNA fragment(s) and opened
plasmid(s).

In an embodiment of the invention the DNA fragment(s) is (are)
prepared under conditions resulting in a low, medium or high random
mutagenesis frequency.

To obtain low mutagenesis frequency the DNA sequence(s) (comprising
the DNA fragment(s)) may be prepared by a standard PCR amplification
method (US 4,683,202 or Saiki et al., (1988), Science 239, 487 -
491).

A medium or high mutagenesis frequency may be obtained by performing
the PCR amplification under conditions which increase the
misincorporation of nucleotides, for instance as described by Deshler,
(1992), GATA 9(4), 103-106; Leung et al., (1989), Technique, Vol. 1,
No. 1, 11-15.

It is also contemplated according to the invention to combine the
PCR amplification (i.e. according to this embodiment also DNA
fragment mutation) with a mutagenesis step using a suitable physical
or chemical mutagenizing agent, e.g., one which induces transitions,
transversions, inversions, scrambling, deletions, and/or insertions.

In the context of the present invention the term "positive
polypeptide variants" means resulting polypeptide variants possessing
functional properties which have been improved in comparison to the
polypeptides producible from the corresponding input DNA sequences.
Examples of such improved properties can be as different as e.g.
biological activity, enzyme washing performance, antibiotic
resistance etc.

Consequently, which screening method is to be used for identifying
positive variants depends on the desired improved property of the
polypeptide variant in question.

If, for instance, the polypeptide in question is an enzyme and the
desired improved functional property is the wash performance, the
screening in step f) may conveniently be performed by use of a
filter assay based on the following principle:

The recombination host cell is incubated on a suitable medium and
under suitable conditions for the enzyme to be secreted, the medium
being provided with a double filter comprising a first
protein-binding filter and on top of that a second filter exhibiting a low
protein binding capability. The recombination host cell is located
on the second filter. Subsequent to the incubation, the first filter
comprising the enzyme secreted from the recombination host cell is
separated from the second filter comprising said cells. The first
filter is subjected to screening for the desired enzymatic activity
and the corresponding microbial colonies present on the second
filter are identified.

The filter used for binding the enzymatic activity may be any
protein binding filter e.g. nylon or nitrocellulose. The top filter
carrying the colonies of the expression organism may be any filter
that has no or low affinity for binding proteins e.g. cellulose
acetate or Durapore®. The filter may be pre-treated with any of the
conditions to be used for screening or may be treated during the
detection of enzymatic activity.

The enzymatic activity may be detected by a dye, fluorescence,
precipitation, pH indicator, IR-absorbance or any other known
technique for detection of enzymatic activity.

The detecting compound may be immobilized by any immobilizing agent
e.g. agarose, agar, gelatine, polyacrylamide, starch, filter paper,
cloth; or any combination of immobilizing agents.

If the improved functional property of the polypeptide is not
sufficiently good after one cycle of shuffling, the polypeptide may
be subjected to another cycle.

In an embodiment of the invention at least one shuffling cycle is a
backcrossing cycle with the initially used DNA fragment, which may
be the wild-type DNA fragment. This eliminates non-essential
mutations. Non-essential mutations may also be eliminated by using
wild-type DNA fragments as the initially used input DNA material.

It is to be understood that the method of the invention is suitable
for all types of polypeptide, including enzymes such as proteases,
amylases, lipases, cutinases, cellulases, peroxidases and
oxidases.

Also contemplated according to the invention are polypeptides having
biological activity such as insulin, ACTH, glucagon, somatostatin,
somatotropin, thymosin, parathyroid hormone, pigmentary hormones,
somatomedin, erythropoietin, luteinizing hormone, chorionic
gonadotropin, hypothalamic releasing factors, antidiuretic hormones,
thyroid stimulating hormone, relaxin, interferon, thrombopoietin
(TPO) and prolactin.

Especially contemplated according to the present invention is
initially to use input DNA sequences being either wild-type, variant
or modified DNA sequences, such as DNA sequences coding for
wild-type, variant or modified enzymes, respectively, in particular
enzymes exhibiting lipolytic activity.
In an embodiment of the invention the lipolytic activity is a lipase
activity derived from the filamentous fungi of the Humicola sp., in
particular Humicola lanuginosa, especially Humicola lanuginosa.

In a specific embodiment of the invention the initially used input
DNA fragment to be shuffled with a homologous polypeptide is the
wild-type DNA sequence encoding the Humicola lanuginosa lipase
derived from Humicola lanuginosa DSM 4109 described in EP 305 216
(Novo Nordisk A/S).

Also specifically encompassed by the scope of the invention are input
DNA sequences selected from the group of vectors (a) to (f) and/or
DNA fragments (g) to (aa) coding for Humicola lanuginosa lipase
variants from the list below in the Material and Method section.

Throughout the present application the name Humicola lanuginosa has
been used to identify one preferred parent enzyme, i.e. the one
mentioned immediately above. However, in recent years H. lanuginosa
has also been termed Thermomyces lanuginosus (a species introduced
the first time by Tsiklinsky in 1899) since the fungus shows
morphological and physiological similarity to Thermomyces
lanuginosus. Accordingly, it will be understood that whenever
reference is made to H. lanuginosa this term could be replaced by
Thermomyces lanuginosus. The DNA encoding part of the 18S ribosomal
gene from Thermomyces lanuginosus (or H. lanuginosa) has been
sequenced. The resulting 18S sequence was compared to other 18S
sequences in the GenBank database and a phylogenetic analysis using
parsimony (PAUP, Version 3.1.1, Smithsonian Institution, 1993) has
also been made. This clearly assigns Thermomyces lanuginosus to the
class of Plectomycetes, probably to the order of Eurotiales.
According to the Entrez Browser at the NCBI (National Center for
Biotechnology Information), this relates Thermomyces lanuginosus to
families like Eremascaceae, Monoascaceae, Pseudoeurotiaceae and
Trichocomaceae, the latter containing genera like Emericella,
Aspergillus, Penicillium, Eupenicillium, Paecilomyces, Talaromyces,
Thermoascus and Sclerocleista.

Consequently, such genes encoding lipolytic enzymes of filamentous
fungi of the genera Emericella, Aspergillus, Penicillium,
Eupenicillium, Paecilomyces, Talaromyces, Thermoascus and
Sclerocleista are also specifically contemplated according to the
present invention.

Other examples of relevant filamentous fungi genes encoding
lipolytic enzymes include strains of the Absidia sp. e.g. the
strains listed in WO 96/13578 (from Novo Nordisk A/S) which are
hereby incorporated by reference. Absidia sp. strains listed in WO
96/13578 include Absidia blakesleeana, Absidia corymbifera and
Absidia reflexa.

Strains of Rhizopus sp., in particular Rh. niveus and Rh. oryzae are
also contemplated according to the invention.

The lipolytic gene may also be derived from a bacterium, such as a
strain of the Pseudomonas sp., in particular Ps. fragi, Ps.
stutzeri, Ps. cepacia and Ps. fluorescens (WO 89/04361), or Ps.
plantarii or Ps. gladioli (US 4,950,417) or Ps. alcaligenes and Ps.
pseudoalcaligenes (EP 218 272, EP 331 376, or WO 94/25578
(disclosing variants of the Ps. pseudoalcaligenes lipolytic enzyme)),
the Pseudomonas sp. variants disclosed in EP 407 225, or a
Pseudomonas sp. lipolytic enzyme, such as the Ps. mendocina (also
termed Ps. putida) lipolytic enzyme described in WO 88/09367 and US
5,389,536 or variants thereof as described in US 5,352,594, or Ps.
aeruginosa or Ps. glumae, or Ps. syringae, or Ps. wisconsinensis (WO
96/12012 from Solvay), or a strain of Bacillus sp., e.g. the B.
subtilis described by Dartois et al., (1993) Biochemica et
Biophysica acta 1131, 253-260, or B. stearothermophilus (JP
64/7744992) or B. pumilus (WO 91/16422) or a strain of Streptomyces
sp., e.g. S. scabies, or a strain of Chromobacterium sp. e.g. C.
viscosum.

In connection with the Pseudomonas sp. lipases it has been found
that lipases from the following organisms have a high degree of
homology, such as at least 60% homology, at least 80% homology or at
least 90% homology, and thus are contemplated to belong to the same
family of lipases: Ps. ATCC 21808, Pseudomonas sp. lipase
commercially available as Liposam®, Ps. aeruginosa EF2, Ps.
aeruginosa PAC1R, Ps. aeruginosa PAO1, Ps. aeruginosa TE 3285, Ps.
sp. 109, Ps. pseudoalcaligenes M1, Ps. glumae, Ps. cepacia DSM 3959,
Ps. cepacia M-12-33, Ps. sp. KWI-56, Ps. putida IFO 3458, Ps. putida
IFO 12049 (Gilbert, E. J., (1993), Pseudomonas lipases: Biochemical
properties and molecular cloning. Enzyme Microb. Technol., 15, 634-
645). The species Pseudomonas cepacia has recently been reclassified
as Burkholderia cepacia, but is termed Ps. cepacia in the present
application.

Also genes encoding lipolytic enzymes from yeasts are relevant and
include lipolytic genes from Candida sp., in particular Candida
rugosa, or Geotrichum sp., in particular Geotrichum candidum.

Specific examples of microorganisms comprising genes encoding
lipolytic enzymes used for commercially available products and which
may serve as donor of genes to be shuffled according to the
invention include Humicola lanuginosa, used in Lipolase®, Lipolase®
Ultra, Ps. mendocina used in Lumafast®, Ps. alcaligenes used in
Lipomax®, Fusarium solani, Bacillus sp. (US 5427935, EP 523323),
Ps. mendocina, used in Liposam®.

Also the Pseudomonas sp. lipase gene shown in SEQ ID NO 14 is
specifically contemplated according to the invention.



It is to be emphasized that genes encoding lipolytic enzyme to be shuffled
according to the invention may be any of the above mentioned genes of
lipolytic enzymes and any variant, modification, or truncation thereof.
Examples of such genes which are specifically contemplated include the
genes encoding the enzymes described in WO 92/05249, WO 94/01541, WO
94/14951, WO 94/25377, WO 95/22515 and protein engineered lipase variants
as described in EP 407 225; a protein engineered Ps. mendocina lipase as
described in US 5,352,594; a cutinase variant as described in WO 94/14964;
a variant of an Aspergillus lipolytic enzyme as described in EP patent
167,309; and Pseudomonas sp. lipase described in WO 95/06720.



A requirement for the DNA sequences, encoding the polypeptide(s), to be
shuffled, is that they are at least 60%, preferably at least 70%,
better more than 80%, especially more than 90%, and even better up
to almost 100% homologous. DNA sequences being less homologous will
have less inclination to interact and recombine.
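
The homology thresholds above can be illustrated with a minimal sketch. It assumes, purely for illustration, that "homology" is measured as percent identity over two sequences that have already been aligned to equal length; the text does not specify how homology is calculated, and the function names and tier labels are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Assumption: both sequences were aligned beforehand and have equal length;
    gap characters ("-") never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the preference tiers stated in the text."""
    if identity > 90.0:
        return "especially preferred (more than 90%)"
    if identity > 80.0:
        return "better (more than 80%)"
    if identity >= 70.0:
        return "preferred (at least 70%)"
    if identity >= 60.0:
        return "minimum (at least 60%)"
    return "below the stated 60% threshold"


if __name__ == "__main__":
    identity = percent_identity("ATGCATGCAT", "ATGCATGCAA")
    print(identity, homology_tier(identity))
```

Percent identity is only one possible measure; a real comparison would normally start from an alignment produced by a dedicated tool.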



It is also contemplated according to the invention to shuffle parent
(homologous) wild-type organisms of different genera.
Further, the DNA fragment(s) to be shuffled may preferably have a
length of from about 20 bp to 8 kb, preferably about 40 bp to 6 kb,
more preferred about 80 bp to 4 kb, especially about 100 bp to 2 kb,
to be able to interact optimally with the opened plasmid.
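
As a small illustration, the nested length ranges above can be encoded as a lookup. This is a sketch only: the tier labels are hypothetical names, 1 kb is taken as 1000 bp, and the "about" in the text is treated as an exact bound.

```python
# Ranges from the text, narrowest first: (label, minimum bp, maximum bp).
# The labels are illustrative names, not terms used in the text.
LENGTH_RANGES = [
    ("especially preferred", 100, 2000),
    ("more preferred", 80, 4000),
    ("preferred", 40, 6000),
    ("acceptable", 20, 8000),
]


def fragment_length_tier(length_bp: int) -> str:
    """Return the narrowest stated range that contains the fragment length."""
    for label, low, high in LENGTH_RANGES:
        if low <= length_bp <= high:
            return label
    return "outside the stated range"


if __name__ == "__main__":
    for length in (10, 25, 50, 900, 3000):
        print(length, fragment_length_tier(length))
```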

The method of the invention is very efficient for preparing
polypeptide variants in comparison to prior art methods comprising
transforming linear DNA fragments/sequences.

The inventor found that the transformation frequency of a mixture of
opened plasmid and a DNA fragment was significantly higher than
when transforming a plasmid cut at the same site alone. The
transformation frequency of the opened plasmid and DNA fragment was as
high as for uncut plasmid.

Without being limited to any theory it is believed that the opening
of the plasmid(s) restrict(s) the replication of (opened) plasmid(s)
when not interacting with at least one DNA fragment. In accordance
with this an increased number of recombined DNA sequences were found
after only one shuffling cycle.

As described in Example 1, 50% of the resulting transformants
contained recombined DNA sequences of both input DNA sequences. As
high as 20% of the total number of recombined DNA sequences were
"random" mixtures (i.e. having more than one region of nucleotides
exchanged).

The input DNA sequences may be any DNA sequences including wild-type
DNA sequences, DNA sequences encoding variants or mutants, or
modifications thereof, such as extended or elongated DNA sequences,
and may also be the outcome of DNA sequences having been subjected
to one or more cycles of shuffling (i.e. output DNA sequences)
according to the method of the invention or any other method (e.g.
any of the methods described in the prior art section).

When using the method of the invention the output DNA sequences
(i.e. shuffled DNA sequences), have had a number of nucleotide(s)
exchanged. This results in replacement of at least one amino acid
within the polypeptide variant, if comparing it with the parent
polypeptide. It is to be understood that also silent mutations are
contemplated (i.e. nucleotide exchange which does not result in
changes in the amino acid sequence).

However, the method of the present invention will in most cases lead
to the replacement of a considerable number of amino acids and may in
certain cases even alter the structure of one or more polypeptide
domains (i.e. a folded unit of polypeptide structure).

According to the present invention more than two DNA sequences may be
shuffled at the same time. Actually any number of different DNA
fragments and homologous polypeptides comprised in suitable plasmids
may be shuffled at the same time. This is advantageous as a vast
number of quite different variants can be made rapidly without an
abundance of iterative procedures.

The inventor has tested the nucleotide shuffling method of the
invention using significantly more than two homologous DNA
sequences. As described in Example 2 it was surprisingly found that
the method of the invention advantageously can be used for
recombining more than two DNA sequences.

One cycle of shuffling according to the method of the invention may
result in the exchange of from 1 to 1000 nucleotides into the opened
plasmid DNA sequence encoding the polypeptide in question. The
exchanged nucleotide sequence(s) may be continuous or may be present
as a number of sub-sequences within the full-length sequence(s).

To support the present invention the inventor made a number of
additional experiments on different aspects of the method of the
invention. The experiments are described below and illustrated in
Examples 3 to 6 below.

A number of vectors and fragments comprising inactivated
synthetic Humicola lanuginosa lipase genes were constructed by
introducing frameshift/stop codon mutations in the lipase gene at
various positions. These were used for monitoring the in vivo
recombination of different combinations of opened vector(s) and DNA
fragments. The number of active lipase colonies was scored as
described in Example 3. The number of colonies determines the
efficiency of the opened vector(s) and fragment(s) recombination.
One frameshift mutation in said Humicola lanuginosa lipase gene in
the opened vector and another in the fragment on the opposite side
of the opening site gave 3 to 32% of active lipase colonies
depending on the location and combination. It was concluded that
the closer the mutation is to the ends of the vector, the
higher the mixing.

One frameshift mutation in the opened vector and two in the
fragment on each side of the opening site gave 4 to 42% of active
colonies depending on the location and combination. Some of these
active colonies can be considered to be mosaics, not only related
to the opening site.

Two frameshift mutations in the opened vector on each side of the
opening site and one in the fragment gave 0.5 to 3.1% of active
colonies depending on the location and combination. Most of these
active colonies are mosaics of the "parent" DNA.

Two frameshift mutations in the opened vector on each side of the
opening site and a wild type fragment gave 7.7 to 10.7% of active
colonies depending on the location.

It was also found that the amount of vectors relative to fragments
and the size of the fragments also influence the result.

Use of S. cerevisiae rad52 mutants as the recombination host
cell showed that the rad52 mutant transformed very well with wild
type plasmid(s) and expressed the Humicola lanuginosa lipase gene,
but gave no transformants at all with the opened vectors and
fragments.

The RAD52 function is required for "classical recombination" (but
not for unequal sister-strand mitotic recombination) showing that
the recombination of opened vector and fragment could involve a
classical recombination mechanism.

Classical recombination is the recombination mechanism involved in
the recombination between genes located on nonsister chromatids of
homologous chromosomes as defined in for example Petes TD, Malone
RE and Symington LS (1991) "Recombination in Yeast", pages 407-522,
in The Molecular and Cellular Biology of the Yeast Saccharomyces,
Volume 1 (eds. Broach JR, Pringle JR and Jones EW), Cold Spring
Harbor Laboratory Press, New York.

Multiple partially overlapping fragments

The inventor also tested recombination of multiple partial
overlapping fragments using the method of the invention.

The recombination of 2 and 3 partial overlapping fragments into a
gapped (i.e. that the opening results in cutting out of a little
part of the gene) vector was tested and gave a high recovery of
recombined Humicola lanuginosa lipase gene. The recovery of active
lipase gene from different combinations of inactivated Humicola
lanuginosa genes was tested for the recombination of 2 partial
overlapping fragments. The tendency was a higher mixing in the
overlapping region between the 2 fragments in the gapped region
than in the vector and fragment overlap.

When recombining many fragments from the same region, the multiple
overlapping fragment technique will increase the mixing by itself,
but it is also important to have a relatively high random mixing in
overlapping regions in order to mix closely located
variants/differences.

An overlap as small as 10 bp between two fragments was found to be
sufficient to obtain a very efficient recombination. Therefore,
overlapping in the range from 5 to 5000 bp, preferably from 10 bp to
500 bp, especially 10 bp to 100 bp is suitable according to the
method of the invention.
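
The overlap ranges above can be checked in silico with a short sketch. It assumes, as a simplification, that the overlap is an exact match between the end of one fragment and the start of the next; homologous sequences need not be identical, so this is a toy model, and the function names and example sequences are hypothetical.

```python
def end_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` equal to a prefix of `downstream`."""
    max_len = min(len(upstream), len(downstream))
    for size in range(max_len, 0, -1):
        if upstream[-size:] == downstream[:size]:
            return size
    return 0


def overlap_tier(overlap_bp: int) -> str:
    """Classify an overlap length against the ranges stated in the text."""
    if 10 <= overlap_bp <= 100:
        return "especially suitable (10 to 100 bp)"
    if 10 <= overlap_bp <= 500:
        return "preferred (10 to 500 bp)"
    if 5 <= overlap_bp <= 5000:
        return "suitable (5 to 5000 bp)"
    return "outside the stated range"


if __name__ == "__main__":
    first = "ATGGCTAGCTAGGATCCGTA"   # made-up sequence
    second = "AGGATCCGTACCTTAGA"     # shares its first 10 bases with the end of `first`
    size = end_overlap(first, second)
    print(size, overlap_tier(size))
```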

According to this embodiment of the present invention 2 or more
overlapping fragments, preferably 2 to 6 overlapping fragments,
especially 2 to 4 overlapping fragments may advantageously be used
as input fragments in a shuffling cycle.

Besides increasing the mixing of genes, this is a very useful
method for domain shuffling by creating small overlaps between DNA
fragments from different domains and screening for the best
combination.

For instance, in the case of three DNA fragments the overlapping
regions may be as follows:
- the first end of the first fragment overlaps the first end of the
opened plasmid,

- the first end of the second fragment overlaps the second end of
the first fragment, and the second end of the second fragment
overlaps the first end of the third fragment,

- the first end of the third fragment overlaps (as stated above) the
second end of the second fragment, and the second end of the third
fragment overlaps the second end of the opened plasmid.

It is to be understood that when using two or more DNA fragments as
starting material it is preferred to have continuous overlaps between
the ends of the plasmid and the DNA fragments.
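
The preference for continuous overlaps between the plasmid ends and the fragments can be sketched as a coordinate check. This is an illustrative model only: fragments and the gap are described by hypothetical (start, end) positions in gene coordinates, and the 10 bp default minimum overlap is an assumption borrowed from the smallest overlap reported to be sufficient.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) in gene coordinates, end exclusive


def has_continuous_overlaps(
    gap: Interval, fragments: List[Interval], min_overlap: int = 10
) -> bool:
    """True if the fragments bridge the gap with overlaps at every junction.

    The opened (gapped) plasmid is assumed to carry the gene up to gap[0]
    and again from gap[1] onwards.
    """
    if not fragments:
        return False
    gap_start, gap_end = gap
    ordered = sorted(fragments)
    # First fragment must overlap the plasmid end upstream of the gap.
    if gap_start - ordered[0][0] < min_overlap:
        return False
    # Last fragment must overlap the plasmid end downstream of the gap.
    if ordered[-1][1] - gap_end < min_overlap:
        return False
    # Each neighbouring pair of fragments must overlap each other.
    for (_, end_a), (start_b, _) in zip(ordered, ordered[1:]):
        if end_a - start_b < min_overlap:
            return False
    return True


if __name__ == "__main__":
    gap = (400, 500)
    three_fragments = [(300, 430), (420, 470), (460, 600)]
    print(has_continuous_overlaps(gap, three_fragments))
```

Sorting by start position keeps the check simple; it does not handle a fragment that is completely contained in another.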

Even though it is preferred to shuffle homologous DNA sequences in
the form of DNA fragment(s) and opened plasmid(s), it is also
contemplated according to the invention to shuffle two or more
opened plasmids comprising homologous DNA sequences encoding
polypeptides. However, in such case it is compulsory to open the
plasmids at different sites.

In a further embodiment of the invention two or more opened
plasmids and one or more homologous DNA fragments are used as the
starting material to be shuffled. The ratio between the opened
plasmid(s) and homologous DNA fragment(s) preferably lies in the
range from 20:1 to 1:50, preferably from 2:1 to 1:10 (mol vector:mol
fragments) with the specific concentrations being from 1 pM to 10 M
of the DNA.
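
Because the ratio above is given in moles rather than mass, a short conversion sketch may help. It assumes an approximate average mass of 650 g/mol per base pair of double-stranded DNA, which is a common rule of thumb and not a value from the text; the function names and the vector and fragment sizes in the example are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of double-stranded DNA (approximation)


def picomoles(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA in nanograms to picomoles."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def vector_to_fragment_ratio(
    vector_ng: float, vector_bp: int, fragment_ng: float, fragment_bp: int
) -> float:
    """Molar ratio of opened vector to fragment (mol vector per mol fragment)."""
    return picomoles(vector_ng, vector_bp) / picomoles(fragment_ng, fragment_bp)


def ratio_tier(ratio: float) -> str:
    """Classify a vector:fragment molar ratio against the ranges in the text."""
    if 1.0 / 10.0 <= ratio <= 2.0:
        return "preferred (2:1 to 1:10)"
    if 1.0 / 50.0 <= ratio <= 20.0:
        return "within the stated range (20:1 to 1:50)"
    return "outside the stated range"


if __name__ == "__main__":
    # Hypothetical amounts: 100 ng of a 6000 bp vector and 100 ng of a 900 bp fragment.
    ratio = vector_to_fragment_ratio(100.0, 6000, 100.0, 900)
    print(round(ratio, 3), ratio_tier(ratio))
```

Equal masses of a long vector and a short fragment give a molar excess of fragment, which is why the ratio is worth computing explicitly.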

The opened plasmids may advantageously be gapped in such a way that
the overlap between the fragments is deleted in the vector in order
to select for the recombination.
Preparing the DNA fragment

The DNA fragment to be shuffled with the homologous polypeptide
comprised in an opened plasmid may be prepared by any suitable
method. For instance, the DNA fragment may be prepared by PCR
amplification (polymerase chain reaction), as described above, of a
plasmid or vector comprising the gene of the polypeptide, using
specific primers, for instance as described in US 4,683,202 or Saiki
et al., (1988), Science 239, 487 - 491. The DNA fragment may also be
cut out from a vector or plasmid comprising the desired DNA sequence
by digestion with restriction enzymes, followed by isolation using
e.g. electrophoresis.

The DNA fragment encoding the homologous polypeptide in question may
alternatively be prepared synthetically by established standard
methods, e.g. the phosphoramidite method described by Beaucage and
Caruthers, (1981), Tetrahedron Letters 22, 1859 - 1869, or the
method described by Matthes et al., (1984), EMBO Journal 3, 801 -
805. According to the phosphoramidite method, oligonucleotides are
synthesized, e.g. in an automatic DNA synthesizer, purified,
annealed, ligated and cloned in suitable vectors.

Furthermore, the DNA fragment may be of mixed synthetic and genomic,
mixed synthetic and cDNA or mixed genomic and cDNA origin prepared
by ligating fragments of synthetic, genomic or cDNA origin (as
appropriate), the fragments corresponding to various parts of the
entire DNA sequence, in accordance with standard techniques.

The plasmid 

The plasmid comprising the DNA sequence encoding the polypeptide in 
question may be prepared by ligating said DNA sequence into a
suitable vector or plasmid, or by any other suitable method.

Said vector may be any vector which may conveniently be subjected to
recombinant DNA procedures. The choice of vector will often depend
on the recombination host cell into which it is to be introduced.

Thus, the vector may be an autonomously replicating vector, i.e. a 
vector which exists as an extrachromosomal entity, the replication 
of which is independent of chromosomal replication, e.g. a plasmid. 
Alternatively, the vector may be one which, when introduced into the
recombination host cell, is integrated into the host cell genome and
replicated together with the chromosome(s) into which it has been
integrated.

To facilitate the screening process it is preferred that the vector 
is an expression vector in which the DNA sequence encoding the 
polypeptide in question is operably linked to additional segments 
required for transcription of the DNA. In general, the expression 
vector is derived from a plasmid, a cosmid or a bacteriophage, or 
may contain elements of any or all of these. 

The term "operably linked" indicates that the segments are arranged
so that they function in concert for their intended purposes, e.g.
transcription initiates in a promoter and proceeds through the DNA
sequence coding for the polypeptide in question.

The promoter may be any DNA sequence which shows transcriptional
activity in the recombination host cell of choice and may be derived
from genes encoding proteins, such as enzymes, either homologous or
heterologous to the host cell.

Examples of suitable promoters for use in yeast host cells include
promoters from yeast glycolytic genes (Hitzeman et al., (1980), J.
Biol. Chem. 255, 12073 - 12080; Alber and Kawasaki, (1982), J. Mol.
Appl. Gen. 1, 419 - 434) or alcohol dehydrogenase genes (Young et
al., in Genetic Engineering of Microorganisms for Chemicals
(Hollaender et al, eds.), Plenum Press, New York, 1982), or the TPI1
(US 4,599,311) or ADH2-4c (Russell et al., (1983), Nature 304, 652 -
654) promoters.

Examples of suitable promoters for use in filamentous fungus host
cells are, for instance, the ADH3 promoter (McKnight et al., (1985),
The EMBO J. 4, 2093 - 2099) or the tpiA promoter. Examples of other
useful promoters are those derived from the gene encoding A. oryzae
TAKA amylase, Rhizomucor miehei aspartic proteinase, A. niger
neutral α-amylase, A. niger acid stable α-amylase, A. niger or A.
awamori glucoamylase (gluA), Rhizomucor miehei lipase, A. oryzae
alkaline protease, A. oryzae triose phosphate isomerase or A.
nidulans acetamidase. Preferred are the TAKA-amylase and gluA
promoters.

The DNA sequence encoding the polypeptide in question may
also, if necessary, be operably connected to a suitable terminator,
such as the human growth hormone terminator (Palmiter et al., op.
cit.) or (for fungal hosts) the TPI1 (Alber and Kawasaki, op. cit.)
or ADH3 (McKnight et al., op. cit.) terminators. The vector may
further comprise elements such as polyadenylation signals (e.g. from
SV40 or the adenovirus 5 Elb region), transcriptional enhancer
sequences (e.g. the SV40 enhancer) and translational enhancer
sequences (e.g. the ones encoding adenovirus VA RNAs).

The vector may further comprise a DNA sequence enabling the vector
to replicate in the recombination host cell in question.
When the host cell is a yeast cell, suitable sequences enabling the
vector to replicate are the yeast plasmid 2µ replication genes REP
1-3 and origin of replication.

The plasmid pY1 can be used for production of useful proteins and
peptides, using filamentous fungi, such as Aspergillus sp., and
yeasts as recombinant host cells (JP06245777-A).

The vector may also comprise a selectable marker, e.g. a gene the
product of which complements a defect in the recombination host
cell, such as the gene coding for dihydrofolate reductase (DHFR) or
the Schizosaccharomyces pombe TPI gene (described by P.R. Russell,
(1985), Gene 40, 125-130).

Another example of such suitable selective markers are the ura3 and
leu2 genes, which complement the corresponding defect genes of e.g.
the yeast strain Saccharomyces cerevisiae YNG318.

The vector may also comprise a selectable marker which confers
resistance to a drug, e.g. ampicillin, kanamycin, tetracyclin,
chloramphenicol, neomycin, hygromycin or methotrexate. For
filamentous fungi, selectable markers include amdS, pyrG, argB, niaD,
sC, trpC, pyr4, and DHFR.

To direct the polypeptide in question into the secretory pathway of
the recombination host cell, a secretory signal sequence (also known
as a leader sequence, prepro sequence or pre sequence) may be
provided in the recombinant vector. The secretory signal sequence is
joined to the DNA sequence encoding the lipolytic enzyme in the
correct reading frame. Secretory signal sequences are commonly
positioned 5' to the DNA sequence encoding the polypeptide. The
secretory signal sequence may be the signal normally associated with
the polypeptide in question or may be from a gene encoding another
secreted protein.

The signal peptide may be a naturally occurring signal peptide, or a
functional part thereof, or it may be a synthetic peptide. For
secretion from yeast cells, suitable signal peptides have been found
to be the α-factor signal peptide (cf. US 4,870,008), the signal
peptide of mouse salivary amylase (cf. O. Hagenbuchle et al.,
(1981), Nature 289, 643-646), a modified carboxypeptidase signal
peptide (cf. L.A. Valls et al., (1987), Cell 48, 887-897), the
Humicola lanuginosa lipase signal peptide, the yeast BAR1 signal
peptide (cf. WO 87/02670), or the yeast aspartic protease 3 (YAP3)
signal peptide (cf. M. Egel-Mitani et al., (1990), Yeast 6, 127-
137).

For efficient secretion in yeast a sequence encoding a leader
peptide may also be inserted downstream of the signal sequence and
upstream of the DNA sequence encoding the polypeptide in question.
The function of the leader peptide is to allow the expressed
polypeptide to be directed from the endoplasmic reticulum to the
Golgi apparatus and further to a secretory vesicle for secretion
into the culture medium (i.e. exportation of the polypeptide across
the cell wall or at least through the cellular membrane into the
periplasmic space of the yeast cell). The leader peptide may be the
yeast α-factor leader (the use of which is described in e.g. US
4,546,082, EP 16 201, EP 123 294, EP 123 544 and EP 163 529).
Alternatively, the leader peptide may be a synthetic leader peptide,
which is to say a leader peptide not found in nature. Synthetic
leader peptides may, for instance, be constructed as described in WO
89/02463 or WO 92/11378.

For use in filamentous fungi, the signal peptide may conveniently be
derived from a gene encoding an Aspergillus sp. amylase or
glucoamylase, a gene encoding a Rhizomucor miehei lipase or
protease, or a Humicola lanuginosa lipase. The signal peptide is
preferably derived from a gene encoding A. oryzae TAKA amylase, A.
niger neutral α-amylase, A. niger acid-stable amylase, or A. niger
glucoamylase.

The recombination host cell 

The recombination host cell, into which the mixture of
plasmid/fragment DNA sequences is to be introduced, may be any
eukaryotic cell, including fungal cells and plant cells, capable of
recombining the homologous DNA sequences in question.

According to prior art, prokaryotic microorganisms, such as bacteria
including Bacillus and E. coli; eukaryotic organisms, such as
filamentous fungi, including Aspergillus, and yeasts such as
Saccharomyces cerevisiae; and tissue culture cells from avian or
mammalian origins have been suggested for in vivo recombination. All
of said organisms can be used as recombination host cell, but in
general prokaryotic cells are not sufficiently effective (i.e. do
not result in a sufficient number of variants) to be suitable for
recombination methods for industrial use.

Consequently, preferred recombination host cells according to the
present invention are fungal cells, such as yeast cells or
filamentous fungi.

Examples of suitable yeast cells include cells of Saccharomyces sp.,
in particular strains of Saccharomyces cerevisiae or Saccharomyces
kluyveri, or Schizosaccharomyces sp. Methods for transforming yeast
cells with heterologous DNA and producing heterologous polypeptides
therefrom are described, e.g. in US 4,599,311, US 4,931,373, US
4,870,008, 5,037,743, and US 4,845,075, all of which are hereby
incorporated by reference. Transformed cells may be selected by,
e.g., a phenotype determined by a selectable marker, commonly drug
resistance or the ability to grow in the absence of a particular
nutrient, e.g. leucine. A preferred vector for use in yeast is the
POT1 vector disclosed in US 4,931,373. The DNA sequence encoding the
polypeptide may be preceded by a signal sequence and optionally a
leader sequence, e.g. as described above. Further examples of
suitable yeast cells are strains of Kluyveromyces, such as K.
lactis, Hansenula, e.g. H. polymorpha, or Pichia, e.g. P. pastoris
(cf. Gleeson et al., (1986), J. Gen. Microbiol. 132, 3459-3465; US
4,882,279).

Examples of other fungal cells are cells of filamentous fungi, e.g.
Aspergillus sp., Neurospora sp., Fusarium sp. or Trichoderma sp., in
particular strains of A. oryzae, A. nidulans or A. niger. The use of
Aspergillus sp. for the expression of proteins is described in,
e.g., EP 272 277, EP 230 023. The transformation of F. oxysporum
may, for instance, be carried out as described by Malardier et al.,
(1989), Gene 78, 147-155.

In a preferred embodiment of the invention the recombination host
cell is a cell of the genus Saccharomyces, in particular S.
cerevisiae.



METHODS AND MATERIALS 

DNA sequence:

Humicola lanuginosa DSM 4109 derived lipase encoding DNA sequence.
Humicola lanuginosa lipase variants:

Variants used for preparing vectors to be opened with NruI in
Example 2:

(a) E56R, D57L, I90F, D96L, E99K

(b) E56R, D57L, V60M, D62N, S83T, D96P, D102E

(c) D57G, N94K, D96L, L97M

(d) E87K, G91A, D96R, I100V, E129K, K237M, I252L, P256T, G263A, L264Q

(e) E56R, D57G, S58F, D62C, E87G, G91A, F95L, D96P, K98I, (K237M)



Variants used for preparing DNA fragments by standard PCR
amplification in Example 2:

(g) S83T, N94K, D96N
(h) E87K, D96V
(i) N94K, D96A
(j) E87K, G91A, D96A
(k) D167G, E210V
(l) S83T, G91A, Q249R
(m) E87K, G91A
(n) S83T, E87K, G91A, N94K, D96N, D111N
(o) N73D, E87K, G91A, N94I, D96G
(p) L67P, I76V, S83T, E87N, I90N, G91A, D96A, K98R
(q) S83T, E87K, G91A, N92H, N94K, D96M
(s) S85P, E87K, G91A, D96L, L97V
(t) E87K, I90N, G91A, N94S, D96N, I100T
(u) I34V, S54P, F80L, S85T, D96G, R108W, G109V, D111G, S116P, L124S,
    V132M, V140Q, V141A, F142S, H145R, N162T, I166V, F181P, F183S,
    R205G, A243T, D254G, F262L
(v) E56R, D57L, I90F, D96L, E99K
(x) E56R, D57L, V60M, D62N, S83T, D96P, D102E
(y) D57G, N94K, D96L, L97M
(z) E87K, G91A, D96R, I100V, E129K, K237M, I252L, P256T, G263A, L264Q
(aa) E56R, D57G, S58F, D62C, T64R, E87G, G91A, F95L, D96P, K98I

Strains:

Expression system host:
Saccharomyces cerevisiae YNG318: Dpep4[cir+] ura3-52, leu2-D2,
his 4-539

Saccharomyces cerevisiae Rad52: Strain M1533 = MATa rad52 ura3,
obtained from Torsten Nilsson-Tillgren, Institute of Genetics,
University of Copenhagen.

Plasmids:

pJS026 (see figure 3)
pJS037 (see figure 4)
pYES 2.0 (Invitrogen)

ura3
leu2

Media

SC-ura-: 90 ml 10 x Basal salt, 22.5 ml 20% casamino acids, 9 ml 1%
tryptophan, H2O ad 805 ml, autoclaved, 3.5 ml 5% threonine and 90 ml
20% glucose or 20% galactose added.

LB-medium: 10 g Bacto-tryptone, 5 g Bacto yeast extract, 10 g NaCl
in 1 litre water.

Brilliant Green (BG) (Merck, art. No. 1.01310)

BG-reagent: 4 mg/ml Brilliant Green (BG) dissolved in water

Substrate 1:

10 ml olive oil (Sigma CAT NO. O-1500)
20 ml 2% polyvinyl alcohol (PVA)
The substrate is homogenised for 15-20 minutes.
Methods:

Construction of yeast expression vector

The expression plasmids pJS026 and pJS037 are derived from pYES
2.0. The inducible GAL1-promoter of pYES 2.0 was replaced with the
constitutively expressed TPI (triose phosphate isomerase) promoter
from Saccharomyces cerevisiae (Alber and Kawasaki, (1982), J. Mol.
Appl. Genet., 1, 419-434), and the ura3 promoter has been deleted. A
restriction map of pJS026 and pJS037 is shown in figure 3 and figure
4, respectively.

400 g      Amicase
6.7 g      yeast extract (Difco)
12.5 g     L-Leucin (Fluka)
6.7 g      (NH4)2SO4
10 g       MgSO4·7H2O
17 g       K2SO4
10 ml      Trace compounds
5 ml       Vitamin solution
6.7 ml     H3PO4
25 ml      20% Pluronic (antifoam)
In a total volume of 5000 ml.

The yeast cells are fermented for 5 days at 30°C. They are given a
start dosage of 100 ml 70% glucose and 400 ml 70% glucose is added
per day. A pH of 5.0 is kept by addition of a 10% NH3 solution.
Agitation is 300 rpm for the first 22 hours followed by 900 rpm for
the rest of the fermentation. Air is given with 1 l air/l/min for
the first 22 hours followed by 1.5 l air/l/min for the rest of the
fermentation.



Trace compounds:

6.8 g      ZnCl2
54.0 g     FeCl2·6H2O
19.1 g     MnCl2·4H2O
2.2 g      CuSO4·5H2O
2.58 g     CoCl2
0.62 g     H3BO3
0.024 g    [component name missing]
0.2 g      KI
100 ml     HCl (concentrated)
In a total volume of 1 l.

Vitamin solution:

250 mg     Biotin
3 g        Thiamin
10 g       D-Calciumpantothenate
100 g      Myo-Inositol
50 g       Cholinchlorid
1.5 g      Pyridoxin
1.2 g      Niacinamid
0.4 g      Folicacid
0.4 g      Riboflavin
In a total volume of 1 l.



Transformation of yeast 

Saccharomyces cerevisiae is transformed by standard methods (cf.
Sambrook et al., (1989), Molecular Cloning: A Laboratory Manual,
2nd Ed., Cold Spring Harbor).

Determination of yeast transformation frequency 

The transformation frequency is determined by cultivating the
transformants on SC-ura- plates for 3 days and counting the number of
colonies appearing. The number of transformants per µg opened
plasmid is the transformation frequency.
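
The definition above amounts to a single division. A minimal sketch,
assuming a hypothetical colony count and plasmid amount (the mass is
expressed in whatever unit the frequency is reported per):

```python
def transformation_frequency(colonies: int, plasmid_mass: float) -> float:
    """Transformants per unit mass of opened plasmid."""
    if plasmid_mass <= 0:
        raise ValueError("plasmid mass must be positive")
    return colonies / plasmid_mass


# Hypothetical plate count and amount of opened plasmid, for illustration.
print(transformation_frequency(colonies=250, plasmid_mass=0.5))
```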

Screening for positive variants with improved wash performance 

The following filter assay can be used for screening positive
variants with improved wash performance.

Low calcium filter assay

1) Provide SC Ura- replica plates (useful for selecting strains
carrying the expression vector) with a first protein binding filter
(Nylon membrane) and a second low protein binding filter (Cellulose
acetate) on the top.

2) Spread yeast cells containing a parent lipase gene or a mutated
lipase gene on the double filter and incubate for 2 or 3 days at
30°C.

3) Keep the colonies on the top filter by transferring the top
filter to a new plate.

4) Remove the protein binding filter to an empty petri dish.

5) Pour an agarose solution comprising an olive oil emulsion (2%
PVA:olive oil=3:1), Brilliant green (indicator, 0.004%), 100 mM tris
buffer pH 9 and EGTA (final concentration 5 mM) on the bottom filter
so as to identify colonies expressing lipase activity in the form of
blue-green spots.

6) Identify colonies found in step 5 having a reduced dependency
for calcium as compared to the parent lipase.

DNA sequencing was performed by using Applied Biosystems ABI DNA
sequencer model 373A according to the protocol in the ABI Dye
Terminator Cycle Sequencing kit.

Assessing the efficiency of recombination

The number of colonies determines the efficiency of the opened
vector and fragment recombination. The percentage of colonies with
lipase activity gives an estimate of the mixing of the active and
inactive genes. Theoretically, if wild type and frameshift are
equally likely at each mutated site, the mixing is better the closer
the percentage is to 50% for one frameshift, 25% for 2 frameshifts
and 12.5% for 3 frameshifts.
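
The theoretical percentages follow from multiplying the chance of
carrying the wild-type sequence at each frameshift site. A minimal
sketch, assuming the sites are inherited independently (an assumption
made here for illustration; function and parameter names are
hypothetical):

```python
def expected_active_fraction(n_frameshifts: int, p_wild_type: float = 0.5) -> float:
    """Fraction of recombinants expected to be wild type at every site.

    Assumes each frameshift site independently carries the wild-type
    sequence with probability p_wild_type.
    """
    return p_wild_type ** n_frameshifts


for n in (1, 2, 3):
    print(n, f"{expected_active_fraction(n):.1%}")
```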

Frameshift mutation 

The frameshift mutations were created either by filling in a
restriction site (in case of 5' overhang) or deleting the "sticky
ends" (in case of 3' overhang) by T4 DNA polymerase with or without
dNTP (deoxynucleotides = equal amounts of dATP, dTTP, dCTP and
dGTP). Methods for filling in of restriction sites (referred to as
"F" on Figure 7) and deleting the sticky ends (referred to as
"(D)" on Figure 7) are well known in the art.

Method for assessing colonies with lipase activity

The number of colonies and positives (i.e. with lipase activity) are
calculated as the average of 3 plates.
The cultivation and screening conditions used are the following:

1) Provide SC Ura- plates with a protein binding filter (Nylon
filter) onto the plate.

2) Spread yeast cells containing a parent lipase gene or a mutated
lipase gene on the filter and incubate for 3 or 4 days at 30°C.

3) Remove the protein binding filter with the colonies to a petri
dish containing an agarose solution comprising an olive oil
emulsion (2% PVA:olive oil=3:1), Brilliant green (indicator, 0.004%),
100 mM tris buffer pH 9.

5) Identify colonies expressing lipase activity in the form of
blue-green spots.
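
The averaging over plates can be sketched as follows. The plate
counts are hypothetical, and taking the percentage as the ratio of
the two averages (rather than averaging per-plate percentages) is an
assumption, since the text does not specify which is meant.

```python
from statistics import mean


def summarise_plates(plates):
    """plates: list of (total_colonies, lipase_positive_colonies) per plate.

    Returns (average colonies, average positives, percent active), where
    percent active is the ratio of the two averages.
    """
    avg_colonies = mean(total for total, _ in plates)
    avg_positives = mean(positive for _, positive in plates)
    percent_active = 100.0 * avg_positives / avg_colonies
    return avg_colonies, avg_positives, percent_active


# Hypothetical counts from 3 plates, for illustration only.
plates = [(120, 30), (100, 20), (110, 25)]
avg_colonies, avg_positives, percent_active = summarise_plates(plates)
print(avg_colonies, avg_positives, f"{percent_active:.1f}%")
```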

EXAMPLES 

Example 1

Testing in vivo recombination of two homologous genes

The Saccharomyces cerevisiae expression plasmid pJS026 was
constructed as described above in the "Materials and Methods"
section.

A synthetic Humicola lanuginosa lipase gene (in pJS037) containing
12 additional restriction sites (see figure 4) was cut with NruI,
PstI, and NruI and PstI, respectively, to open the gene
approximately in the middle of the DNA sequence encoding the lipase.

The opened plasmid (pJS037) was transformed into Saccharomyces
cerevisiae YNG318 together with an about 0.9 kb wild-type Humicola
lanuginosa lipase DNA fragment (see figure 1) prepared from pJS026
by PCR amplification.

Further, the opened plasmid was also transformed into the yeast
recombination host cell alone (i.e. without the 0.9 kb synthetic
lipase DNA fragment).



The transformed yeast cells were grown as described in the
"Materials and Methods" section above, and the transformation
frequency was determined as described above.

It was found that the transformation frequency of the opened plasmid
alone was very low (10 transformants per µg opened plasmid), in
comparison to the transformation frequency of said plasmid/fragment
(50,000 transformants per µg opened plasmid).

PCR amplification was carried out on 20 plasmid/fragment
transformants, yielding fragments covering the lipase gene region of
the recombined plasmid/fragments. The PCR fragments of the 20
transformants were analyzed by restriction site digestion using
standard methods. The result is displayed in Table 1.

Table 1

PCR fragment   1    2    3    4    5    6    7    8
P1             -    -    -    -    -    wt   -    wt
P2             sg   sg   sg   -    -    wt   wt   wt
P3             sg   sg   sg   sg   -    sg   sg   nd
P4             nd   sg   sg   -    -    wt   nd   nd
P5             -    -    -    -    -    wt   wt   wt
P6             sg   sg   sg   sg   sg   sg   sg   nd
N1             wt   -    -    -    sg   wt   wt   wt
N2             wt   -    -    wt   -    -    wt   wt
N3             -    -    -    -    -    wt   wt   wt
N4             sg   sg   sg   wt   -    -    wt   wt
N5             sg   sg   sg   wt   -    wt   wt   wt
N6             wt   -    -    sg   sg   sg   sg   sg
P/N1           sg   sg   sg   wt   -    wt   wt   wt
P/N2           sg   sg   sg   sg   sg   sg   sg   nd
P/N3           sg   sg   sg   wt   nd   sg   sg   sg
P/N4           sg   sg   sg   sg   sg   sg   sg   nd
P/N5           sg   sg   sg   sg   sg   sg   sg   nd
P/N6           sg   sg   sg   wt   -    sg   sg   sg
P/N7           nd   wt   wt   -    -    -    nd   wt
P/N8           sg   sg   sg   wt   -    wt   sg   nd

Columns 1 to 8: restriction sites scored, in the order of the
original table; the legible column labels are PstI, XhoI and NruI
(not tested). A dash marks an empty cell.

P: plasmid opened with PstI
N: plasmid opened with NruI
P/N: plasmid opened with PstI and NruI (resulting in the removal of
a 75 bp fragment)

wt: wild-type gene restriction enzyme pattern
sg: synthetic gene restriction enzyme pattern
nd: not determined



As can be seen from Table 1, 10 transformants (equivalent to 50%)
contained recombined DNA sequences. 4 of these 10 DNA sequences
(equivalent to 20% of the 20 transformants) contained either a
region of the wild-type gene recombined into the synthetic gene or a
region of the synthetic gene recombined into the wild-type fragment.
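
The counts stated above can be reproduced by scoring, for each
transformant, how often the pattern switches between "sg" and "wt"
along the gene. The sketch below is illustrative only: the calls are
listed in table order as read from Table 1, with empty and "nd" cells
left out and a few poorly legible entries interpreted from context,
so the data should be treated as an assumption.

```python
# Ordered restriction-pattern calls per transformant ("sg" = synthetic gene,
# "wt" = wild type). Empty and "nd" cells are omitted. Assumed reading of
# Table 1; poorly legible entries were interpreted from context.
CALLS = {
    "P1": ["wt", "wt"],
    "P2": ["sg", "sg", "sg", "wt", "wt", "wt"],
    "P3": ["sg"] * 7,
    "P4": ["sg", "sg", "wt"],
    "P5": ["wt"] * 3,
    "P6": ["sg"] * 7,
    "N1": ["wt", "sg", "wt", "wt", "wt"],
    "N2": ["wt"] * 4,
    "N3": ["wt"] * 3,
    "N4": ["sg", "sg", "sg", "wt", "wt", "wt"],
    "N5": ["sg", "sg", "sg", "wt", "wt", "wt", "wt"],
    "N6": ["wt", "sg", "sg", "sg", "sg", "sg"],
    "P/N1": ["sg", "sg", "sg", "wt", "wt", "wt", "wt"],
    "P/N2": ["sg"] * 7,
    "P/N3": ["sg", "sg", "sg", "wt", "sg", "sg", "sg"],
    "P/N4": ["sg"] * 7,
    "P/N5": ["sg"] * 7,
    "P/N6": ["sg", "sg", "sg", "wt", "sg", "sg", "sg"],
    "P/N7": ["wt", "wt", "wt"],
    "P/N8": ["sg", "sg", "sg", "wt", "wt", "sg"],
}


def switches(calls):
    """Number of sg/wt changes between consecutive scored sites."""
    return sum(a != b for a, b in zip(calls, calls[1:]))


# At least one switch: the sequence is a recombinant of both genes.
recombined = [name for name, calls in CALLS.items() if switches(calls) >= 1]
# Two or more switches: a region of one gene sits inside the other.
inserted = [name for name, calls in CALLS.items() if switches(calls) >= 2]

print(len(recombined), "of", len(CALLS), "recombined")
print(len(inserted), "with an internal region from the other gene")
```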

Example 2 

In vivo recombination of Humicola lanuginosa lipase variants 

The DNA sequences of 20 variants of the Humicola lanuginosa lipase
were in vivo recombined in the same mixture.

Six vectors were prepared from the lipase variants (a) to (f) (see
the list above) by ligation into the yeast expression vector pJS037.
All vectors were cut open with NruI.

DNA fragments of all 20 homologous DNA sequences (g) to (aa) (see
the list above) were prepared by PCR amplification using standard
methods.

The 20 DNA fragments and the 6 opened vectors were mixed and
transformed into the yeast Saccharomyces cerevisiae YNG318 by
standard methods. The recombination host cell was cultivated as
described above and screened as described above. About 20
transformants were isolated and tested for improved wash performance
using the filter assay method described in the "Materials and
Methods" section.

Two positive transformants (named A and B) were identified using the
filter assay.

In comparison to the wild-type amino acid sequence the two
recombined positive transformants had the following mutations.

A: D57G, N94K, D96L, P255T

A is a recombination of two variants.
originates from the vector (d)

===== originates from the DNA fragment prepared from variant (y)

B: D57G, G59V, N94K, D96L, L97M, S116P, S170P, N249R
         ????                    <<<<<  ?????  =====

B is a recombination of vector (c), DNA fragments (n) and (u).
originates from the vector (c)

<<<< originates from the DNA fragment prepared from variant (u)
===== originates from the DNA fragment prepared from variant (n)
???? Amino acid mutation which is not a result of recombination.

As can be seen the resulting positive variants have been formed by
recombination of two or more variants. The amino acid mutations
marked "?????" are not a result of in vivo recombination, as none of
the shuffled lipase variants (see the list above) comprise any of
said mutations. Consequently, these mutations are a result of random
mutagenesis arising during preparation of the DNA fragments by
standard PCR amplification.
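
The reasoning above (a mutation found in no shuffled variant cannot
have come from recombination) can be expressed as a set lookup. The
sketch below uses purely hypothetical parent names and mutations to
show the idea; it is not the data of this example.

```python
def classify_mutations(recombinant, parents):
    """Map each mutation of a recombinant to the parents that carry it.

    An empty list means the mutation is present in none of the parents,
    i.e. it cannot be explained by recombination alone.
    """
    origin = {}
    for mutation in recombinant:
        origin[mutation] = sorted(
            name for name, mutations in parents.items() if mutation in mutations
        )
    return origin


# Hypothetical parents and recombinant, for illustration only.
parents = {
    "vector_1": {"A10V", "G20S"},
    "fragment_1": {"G20S", "K30R"},
    "fragment_2": {"T40I"},
}
recombinant = ["A10V", "K30R", "L50P"]

origin = classify_mutations(recombinant, parents)
novel = [mutation for mutation, sources in origin.items() if not sources]
print(origin)
print("not explained by recombination:", novel)
```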



Example 3

Recombination with one frameshift mutation

Synthetic Humicola lanuginosa lipase gene (in vector pJS037) was
made inactive at various positions by deleting (positions 134/335)
or filling-in (position 253/317/515/T5) restriction enzyme sites
or by site-directed introduction of a stop codon. All inactive
synthetic lipase genes of 9 ID bp can be deduced from Figure 7.

A number of different SZZ bp DNA fragments were made from the above
vectors using primer 46rr and primer 5164 using standard PCR

ents were made using primer 8457 and
2343 and primer 454B (433 bp).

0.5 µl (app. 0.1 µg) of vectors Blue 425, Blue 426, Blue 428 and
Blue 429, opened with PstI (i.e. position 335), and vectors Blue 424
and Blue 425 opened with NruI (i.e. position 464) were together
with 3 µl (app. 0.5 µg) of fragments 424, 425, 426, 428, 429 in
various combinations transformed into 100 µl Saccharomyces cerevisiae
YNG318 competent cells as displayed in Table 1A.

The number of colonies and positives (i.e. with lipase activity) 
were calculated as the average of 3 plates as described in the 
Material and Methods section. 

The result of the test is shown in Table 1A.

Table 1A

     Vector + Fragment          Number of    % of colonies with
                                colonies     active lipase activity
1.   Blue 428 + 429 °           774          16%
2.   Blue 429 + 428 #           645          3%
3.   Blue 426 + 425 $           276          25%
4.   Blue 425 + 426             528
5.   Blue 425/NruI + 426        539          28%
6.   Blue 425 + 424             139          7%
7.   Blue 424/NruI + 425 =      74           32%
8.   Blue 42? + 42?             ?            12%
9.   Blue 42? + wt fragment     ?            37%

Pairwise recombinations of one frameshift mutation on the vector
and another on the fragment on the opposite side of the opening
site. = determined by 9 plates; " determined by 5 plates.


The first 2 rows of Table 1A display vectors and fragments with a
frameshift on each side of the PstI site. The "mirror image"
experiment in row 2 compared to row 1 gives a reproducible lower
number of active colonies. The same is true for rows 3 and 4 even
though it is not as pronounced. Moving the opening site closer to
the frameshift in the vector increases the number of actives, which
may explain the reason for the difference in the "mirror image"
experiments. In both cases the higher number of positives has the
opening site closer to the frameshift in the vector.

It can therefore be concluded that the closer the mutation is to 
the end of the vector the higher chance of mixing. This is probably 
arising from the well known fact that free DNA ends have a high 
recombinogenic potential. Therefore it is desirable to have as many 
free DNA ends as possible to increase the mixing of the genes. This 
is for example obtained in the later example with recombination of 
multiple overlapping fragments. 

Row 6 has a rather low number of actives probably due to the
location of the frameshift on the fragment exactly at the PstI
opening site of the vector.

Row 7 has the frameshift of the vector close to the opening site
and again it gives a high number of actives.

Recombination with one stop codon mutation

In order to test if there is any difference in the recombination
efficiency of stop codon mutations compared to frameshift mutations
the following experiments were made.

In the same way as described above, 0.5 µl (app. 0.1 µg) of vectors
Blue 624, Blue 625 and Blue 626 (see Table 1B) opened with PstI
comprising stop codons at specified positions (positions 184, 317
and 746, respectively) (prepared by site-directed mutagenesis) were
together with 2 µl (app. 0.5 µg) of fragments 624, 625 and 626
transformed into 100 µl Saccharomyces cerevisiae YNG318 competent
cells in various combinations as displayed in Table 1B.



Table 1B

     Vector + Fragment     Number of    % of colonies with lipase
                           colonies     activity
1.   Blue 626 + 624        ND           40%
2.   Blue 624 + 626        ND           12%
3.   Blue 625 + 624        ND           75%
4.   Blue 624 + 625        ND           10%

Pairwise recombinations of one stop codon mutation on the vector
and another on the fragment on the opposite side of the opening
site. ND = not determined but a high number.



Rows 1 and 2 (in Table 1B) have the mutations located at the same
place as rows 1 and 2 in Table 1A. As can be seen the number of
colonies with lipase activity is clearly higher for the stop codon
mutations compared to the frameshift mutations, but the same
relative difference between the "mirror image" experiments is seen.

This might indicate that the stop codon mutations, which are closer
to the "application" of the method, give a better mixing than
frameshift mutations. Rows 3 and 4 confirm that the closer the
mutation is to the end of the vector the higher the chance of mixing.

Recombination with one or two frameshift mutations in the vector
and one or two frameshift mutations in the fragment

Using the same approach as described above the influence of one or
two frameshift mutations in the vector and one or two frameshift
mutations in the fragment was tested using vectors Blue 425, 426
and 428 (one mutation) and vectors Blue 442, Blue 443 (two
mutations) and fragments 442 and 443 (two frameshift mutations)
and fragments 424, 425, 426, 427, 428 (one mutation) and wild-type
(no mutation).

The vectors Blue 442 and 443 are double frameshift mutations: Blue
442 = 428 + 429 and Blue 443 = 427 + 429 (see Figure 7).

Recombination was performed by transforming 0.5 µl vector (app. 0.1
µg) opened with PstI and 3 µl PCR fragment (app. 0.5 µg) into 100
µl Saccharomyces cerevisiae YNG318 competent cells.

The result of the test is shown in Table 2A and Table 2B.



Table 2A

     Vector + Fragment     Number of    % of colonies with active
                           colonies     Lipolase
1.   Blue 425 + 442        142          15%
2.   Blue 425 + 443        144          14%
3.   Blue 426 + 442        42           42%
4.   Blue 426 + 443 #      77           20%
5.   Blue 428 + 443        115          3.8%

One frameshift mutation on the vector and two on the fragment on
each side of the opening site. # determined by 6 plates.



Table 2B

Vector + Fragment          Number of    % of colonies with active
                           colonies     Lipolase
Blue 442 + 424             137          0.5%
Blue 442 + 426             118          1.1%
Blue 442 + 427 #           125          1.3%
Blue 443 + 425             540          2.5%
Blue 443 + 426             196          1.5%
Blue 443 + 428             469          3.1%
Blue 442 + wt fragment     135          7.7%
Blue 443 + wt fragment     488          10.7%

Two frameshift mutations on the vector on each side of the opening
site and one on the fragment. # determined by 6 plates.



Table 2A shows a rather high number of colonies with lipase
activity even with a total of 3 frameshifts (but only one
frameshift on the vector) except for the last row where the
frameshift on the vector is located far from the opening site. Lane
4 has fewer actives than lane 3, probably because the frameshift
on the vector is located further away from the opening site than
the frameshift on the fragment, making the active genes mosaics that
are related to the opening site (see figure 2A). In Table 2B a
very low number of actives are observed when there are 2
frameshifts located on the vector. Most of these active colonies
are mosaics of the "parent" DNA meaning that the mixing is not
related to the opening site (see figure 2B).



Recombination with two different vectors or fragments 

The result of recombination with two different vectors or
fragments is shown in Table 3.

Table 3

Vector + Fragment                        Number of colonies   % of colonies with active Lipolase
Blue 428/PstI + Blue 429/PstI #          13                   15%
Blue 428/PstI + Blue 429/PstI + 442      273                  4.2%
Blue 442/PstI + 428 + 429                222                  0.8%
Blue 443/PstI + 428 + 427                229                  1.6%

Recombinations with 2 different vectors or fragments. # Determined
by 1 plate.

A low number of colonies is seen for the control experiment in row
1 of Table 3, as expected. The fragment added in the middle row has
two frameshifts, each corresponding to the frameshift on each
vector. Via a tripartite recombination 4.2% actives are created.
With two fragments carrying one frameshift each and a vector with the
same two frameshifts, very few actives are found.

Recombination with vectors opened at different sites 

Opening the vector in one side instead of approximately in the
middle still gives good recombination as shown in Table 4. Two
vectors opened at different sites can also recombine to some extent
(compare with the vector controls in Table 1B).



Table 4

Vector + Fragment                    Number of colonies   % of colonies with active Lipolase
Blue 428/XhoI + 429                  (illegible)          (illegible)
Blue 428/XhoI + Blue 429/PstI #      (illegible)          5.3%

Vector opened in one side instead of in the middle. # determined by
6 plates.


Recombination at different concentrations of vector and fragment

The relative concentration of vector to fragment does influence the
percentage of positives, as can be seen in Table 5.






Table 5

Vector + Fragment                Number of colonies   % of colonies with lipase activity
0.5 µl Blue 426 + 3 µl 442       42                   42%
1.0 µl Blue 425 + 3 µl 442       21                   51%
1.5 µl Blue 425 + 9 µl 442       34                   26%
1.5 µl Blue 426 + 3 µl 427       230                  2.8%
1 µl Blue 442 + 1 µl 425         224                  1.16%
1 µl Blue 442 + 2 µl 425         429                  0.9%
1 µl Blue 442 + 4 µl 425         434                  1.6%
1 µl Blue 442 + 8 µl 425         481                  1.6%
1 µl Blue 442 + 16 µl 425        497                  2.0%

Varying the concentration of the vector or fragment.
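
To put the percentages in the lower part of Table 5 into perspective, they can be converted into approximate numbers of active colonies. The following is only an illustrative sketch: the row values are copied as read from Table 5, and the function name `active_colonies` is a hypothetical helper, not part of the described method.

```python
# Rows from the lower part of Table 5: 1 µl Blue 442 vector combined with
# an increasing volume of fragment 425 (values as read from the table).
ROWS = [
    # (fragment_volume_ul, colonies, percent_active)
    (1, 224, 1.16),
    (2, 429, 0.9),
    (4, 434, 1.6),
    (8, 481, 1.6),
    (16, 497, 2.0),
]


def active_colonies(colonies, percent_active):
    """Approximate number of active colonies implied by a percentage."""
    return colonies * percent_active / 100.0


for volume, colonies, percent in ROWS:
    estimate = active_colonies(colonies, percent)
    print(f"1:{volume} vector:fragment (v/v): "
          f"{colonies} colonies, about {estimate:.1f} active")
```

Under these assumptions, even the most fragment-rich mixture corresponds to only about ten active colonies per transformation, which is consistent with the low percentages reported for vectors carrying two frameshifts.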



Recombination with fragments of different size 

The size of the fragment also influences the recombination result
as seen in Table 6.



Table 6

Vector + Fragment                    Number of colonies   % of colonies with active Lipolase
Blue 424 + 425 (260 bp)              (illegible)          (illegible)
Blue 424 + 425 (size illegible)      130                  45%
Blue 424 + 424 (430 bp)              153                  0.3%
Blue 424 + 425 (430 bp)              130                  35%
Blue 425 + 425 (480 bp)              150                  28%
Blue 425 + 424 (480 bp)              69                   0%
Blue 425 + 423 (480 bp)              63                   55%

Recombination with smaller fragments than 900 bp.



Recombination with unopened vectors 

Transformation with unopened vectors shows a very low degree of 
recombination (Table 7) . 



Table 7

Plasmid                  Number of colonies   % of colonies with active Lipolase
Blue 428 + Blue 429      887                  0.3%
Blue 426 + Blue 425      697                  0.7%

Recombination of unopened plasmids.



Example 4

Test of S. cerevisiae mutants altered in recombination

Using the same approach as described in Example 3, recombination of
opened and unopened vectors and fragments was tested using a
Saccharomyces cerevisiae rad52 mutant as the recombination host
cell. The result is displayed in Table 8.

Table 8

Vector + Fragment     Number of colonies   % of colonies with active Lipolase
Blue 423 + 423        0                    0
Blue 442 + 427        0
Blue 424 + 425        0
Blue 425 + 443        0                    0
Plasmid pJS037        544                  100%

Recombination result in rad52 mutant.



The result with rad52 showed that recombination was completely
abolished. The RAD52 function is required for classical
recombination (but not for unequal sister-strand mitotic
recombination), showing that the recombination of opened vector and
fragment could involve a classical recombination mechanism.

Example 6 

Recombination of multiple partially overlapping fragments

In order to increase the mixing of the mutations by the
recombination method of the invention, recombination of two
fragments and one gapped vector was attempted.

Table 15

Vector + Fragment                                Number of colonies   % of colonies with lipase activity
1. pJS037/HindIII-XhoI + PCR319+PCR327           > 2000               100%
2. pJS037/HindIII-XhoI + PCR321+PCR331           ≈ 2000               ≈ 0.2%
3. pJS037/HindIII-XhoI + PCR319+PCR331           ≈ 1500               ≈ 1%
4. pJS037/HindIII-XhoI + PCR319+PCR386           > 5000               > 90%
5. pJS037/HindIII-XhoI + PCR321+PCR386           > 5000               ≈ 25%
6. Blue 428/HindIII-XhoI + PCR321+PCR331         400                  0.2%
7. Blue 428/HindIII-XhoI + PCR319+PCR327         ≈ 1500               > 90%
8. Blue 428/HindIII-XhoI + PCR321+PCR327         ≈ 150                ≈ 10%
9. Blue 428/HindIII-XhoI + PCR327+PCR385         ≈ 1500               ≈ 10%
10. Blue 429/HindIII-XhoI + PCR319+PCR386        ≈ 400                ≈ 15%
11. Blue 429/HindIII-XhoI + PCR321+PCR385        ≈ 350                ≈ 15%
12. Blue 442/HindIII-XhoI + PCR319+PCR327        ≈ 1500               ≈ 10%
13. Blue 428/HindIII-XhoI                        2                    0
14. Blue 429/HindIII-XhoI                        0                    0
15. Blue 442/HindIII-XhoI                        5                    0
16. Blue 428/HindIII-XhoI + PCR331               4                    0
17. Blue 428/HindIII-XhoI + PCR321               2                    0

Recombination result of two fragments and a gapped vector. The last
5 rows are controls.

As can be seen in Table 15, the recovery of the Humicola lanuginosa
lipase gene is very efficient. The last 5 rows in Table 15 show
that the opened vector alone, or with only one fragment not covering
the whole gap (see Figure 3), gives only very few colonies.

The first row, with wild-type fragments, gives 100% active
colonies.

The second row is with two fragments each containing a frameshift.
Fragment PCR331 has the frameshift located at the
BglII site which, in this recombination, is not covered by a
wild-type fragment (see Figure 3) and therefore gives about 0% of active
lipase. The same is the case for rows 3 and 6.

In row 4, fragment PCR386 contains a frameshift at the SphI
site, which is overlapped by wild-type sequences in the gapped
vector. The frameshift was recombined into less than 10% of the
genes, which is lower than the result for one-fragment recombination
in the last row of Table 1A above.

In row 5 a rather high mixing is observed between the 2 fragments
each containing a frameshift and the wild-type gapped vector, giving
25% active and 75% inactive lipase colonies. This is probably
because fragment PCR321 has the frameshift in the overlap
between the 2 fragments and in the gapped region of the vector. If
fragment PCR386 contributes 10% inactives as in row 4,
fragment PCR321 gives the remaining 65% inactives; therefore
PCR386 gives 25% wt in the overlap.



Row 7 is the "mirror image" of row 4, with the frameshift at the
SphI site on the vector (see Figure 7) and 2 wild-type fragments,
giving an integration of the wild-type fragment into more than 90%
of the vectors.

Row 8 shows, as in row 5, that the frameshift of PCR321 in the
overlap and gap region gives a very high number of inactives.

In row 9, fragment PCR385, with a frameshift in the vector overlap,
causes a very high number of inactives.

Row 10 gives a rather high number of inactives compared to rows 7
and 4. It is not increased in row 11.

Row 12 shows that two frameshifts on the vector give a lower
number of actives compared to one in row 7.

The recombination of 3 partially overlapping fragments into a gapped
vector is also very efficient as seen in Table 16. The last row,
with the vector alone, gives very few colonies. As can be seen in
Figure 4, all fragments used are wt. In the first row in Table 16,
there are rather long overlaps between the vector and fragments,
but in the middle row the overlap between PCR353 and 355 is only 10
bp long and it is still very efficiently recombined. This
surprising result may be utilized for very easy domain shuffling of
even distantly related genes. For example, 3 different domains
from 10 different genes can be made as PCR fragments, designed to have
a 10 to 20 bp overlap by primer design, recombined together and
subsequently screened for the best combination (1000 possible
combinations).
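
The figure of 1000 possible combinations follows from choosing one of 10 source genes independently at each of 3 domain positions (10 x 10 x 10). The short sketch below enumerates these choices; the function name `domain_combinations` and the numbering of genes from 1 to 10 are illustrative assumptions only.

```python
from itertools import product


def domain_combinations(n_genes, n_domains):
    """Enumerate every way to pick one source gene per domain position."""
    genes = range(1, n_genes + 1)
    return list(product(genes, repeat=n_domains))


combos = domain_combinations(n_genes=10, n_domains=3)
print(len(combos))   # 1000
print(combos[0])     # (1, 1, 1): all three domains taken from gene 1
```

Each tuple names the gene supplying domain 1, domain 2 and domain 3 of one possible shuffled product, so the size of the library to be screened grows as the number of genes raised to the power of the number of domains.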





Table 16

Vector + Fragment                              Number of colonies   % of colonies with active Lipolase
pJS037/PvuII-SpeI + PCR353+PCR354+PCR357       > 5000               100%
pJS037/PvuII-SpeI + PCR353+PCR355+PCR357       > 5000               100%
pJS037/PvuII-SpeI                              20                   100%

Recombination result of 3 fragments and a gapped vector. The last
row is a control.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Novo Nordisk A/S 

(B) STREET: Novo Alle 

(C) CITY: Bagsvaerd 

(E) COUNTRY: Denmark 

(F) POSTAL CODE (ZIP): DK-2880 

(G) TELEPHONE: +45 4444 8688 

(H) TELEFAX: +45 4449 3256 

(ii) TITLE OF INVENTION: Method for preparing polypeptide variants 

(iii) NUMBER OF SEQUENCES: 15

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30B (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Primer 234.3"



20 



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Primer 4599"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CGGTACCCGG GGATCCAC 18
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Primer 5164"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AATTACATCA TGCGGCCC 18 
(2) INFORMATION FOR SEQ ID NO: 4: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Primer 8487" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

CATTTGCTCC GGCTGCAGGG A 21 
(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 60 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Primer 4548"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



G:G. * CZGCC GGTCTGTACG GTCAGG.-_-.TT CTGC* 
'-l i o . - T ^C G AC i CGGoGGG 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: .':2s: - " r 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GGiCi'w-: r.C G G T C AG G .--A T T C 21 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Primer 5573"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



CGTTTCGGGT GACGGGGA.C 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Primer 1596"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGAGCAAATG TCATTTAT 18 



(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Primer 4545" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



GCATTGGCAA CTGTTGCCGG AGCAGACCTG CGTGGAAATG 

GGTATGATAT CGACGTGTTT TCAT 64 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 876 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular




.-. i G o o r. ^ . CC C . . G . o C . ' ^ . . _ . . * u i . C * u*wG * ou r.C j GCC TTG 4 8 

Met Arc Ser Ser Leu Val Leu Pr.e Phe Val Ser Ala Trr; Thr Ala Leu 

i i 10 15 

GCC AG? CC7 ATT CGT CGA GAG G7C TCG CAG GAT C7G TTT AAC CAG TTC 95 

Ala Ser lie Arr Ar r Glu Val Ser Gin Asd Leu Phe A s n Gin Phe 

20 ' 25 30 

AA7 CTC 77? GGA CAG 7A7 7C7 GCA GCC GCA TAG TGC GGA AAA AA.C AA7 14 4 

Asn Leu Phe Ala Gir. Tvr Ser Ala Ala Ala 7vr Cys Gly Lvs A sr. Asn 

35 40 45 



GAT GCC CCA GC7 GG7 AC A AAC ATT ACG TGC ACG GGA AAT GCC TGC CCC 192 
Asd Ala Pro Ala Glv Thr Asn lie Thr Cvs Thr Gly Asn Ala Cys Pro 
50 '55 * 60 

GAG GTA GAG AAG GCG GAT GCA ACG 777 C7C 7 AC TCG TTT GAA GAC TCT 240 
Glu Val Glu Lvs Ala Asd Ala Thr Phe Leu Tvr Ser Phe Glu Asd Ser 
65 70 75 * 80 

GGA GTG GGC GAT GTC ACC GGC TTC CT7 GCT CTC GAC AA.C ACG AAC AAA 283 
Gly Vai Gly Asd Val Thr Gly Phe Leu Ala Leu Asd Asn Thr Asn Lys 
85 50 95 

TTG ATC GTC CTC TCT TTC CGT GGC TCT CGT TCC ATA GAG AAC TGG ATC 336 
Leu He Val Leu Ser Phe Arg Glv Ser Arc Ser He Glu Asn Trp He 
100 - 105 110 

GGG AAT CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC 384
Gly Asn Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly
115 120 125

TGC AGG GGA CAT GAC GGC TTC ACT TCG TCC TGG AGG TCT GTA GCC GAT 432

Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp
130 135 140

5 ACG TTA AGG CAG AAG GTG GAG GAT GCT GTG AGG GAG CAT CCC GAC TAT 4 80 

Thr Leu Arg Gin Lys Val Glu Asp Aia Val Arg Glu His Pro Aso Tyr 
145 150 155 " 160 

CGC GTG GTG TTT ACC GGA CAT AGC TTG GGT GGT GCA TTG GCA ACT GTT 528 
10 Arg Val Val Phe Thr Gly His Ser Leu Gly Gly -Ala Leu Ala Thr Val 

165 170 175 

GCC GGA GCA GAC CTG CGT GGA AAT GGG TAT GAT ATC GAC GTG TTT TCA 57 6 

Ala Gly Ala Aso Leu Arg Gly Asn Gly Tyr Asp lie Asp -Vai^Phe Ser 
15 180 185 190 

TAT GGC GCC CCC CGA GTC GGA AAC AGG GCT TTT GCA GAA TTC CTG ACC 62 4 

Tyr Gly Aia Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr 
1S5 200 . ' 205 






GTA CAG ACC GGC GGA AC A CTC 7 AC CGC ATT ACC CAC ACC AAT GAT ATT 672 
Val Gin Thr Glv Giv Thr Leu Tvr Arc lie Thr His Thr Asn Aso lie 
210 * 215 220 



2 5 GTC CCT ACA CTC CCG CCG CGC GAA 7 : C GG; AG~ CAT 7C7 AGC CCA 720 

7a i Pro Ar~ Leu Pro Pro Arc Glu ?h-e Giv Tvr Ser His Ser Ser Pro 

225 ' 230 ' 125 240 

GAG .AC 7Go A . T AAA 7C . o_-A AC ^. C. . - : - — oi^ .-iw^ CvjA A_AC GA7 7 55 

30 Glu 7vr 7r= He Lys Ser Glv 7hr Leu Val ?rc val 7hr Arc Asn Aso 

245 250 235 

A7C G7G A_AG A7A GAA. GGC ATC GA7 GCC ACC GGC AA.7- A_AC CAG CCC 316 

lie Vai Lys He Glu Giv lie Aso Aia 7hr 31 v Giy Asn Asn Gin Pro 

35 ' 260 " " 265 270 

A.AC ACT CCG GAT ATC CCC GGG CAC CCA 7GG CAC 77C GGG 77A A7T GGG 8 54 

Asn lie Pro Aso lie Pro Ala His Leu Tro Tvr Phe Glv Leu lie Giv 

275 * 28: 235 



Cvs Lev 
2 90 



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 292 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Arg Ser Ser Leu Vai Leu Phe Phe Vai Ser Ala Tro Thr Aia Leu 
15 10 15 



Ala Ser Pro lie Arc Arc Glu Val Ser Gin Aso Leu Phe Asn Gin Phe 

60 * 20 25 30 

Asn Leu Phe Aia Gin Tyr Ser Ala Ala Aia Tvr Cys Gly Lys Asn Asn 

35 40 45 

Asp Ala Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro

50 55 60
Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser

65 70 75 80

Glv Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys 

5 " 85 90 95 

Leu lie Val Leu Ser Phe Arg Gly Ser Arg Ser lie Glu Asn Trp lie 

100 105 no 

10 Glv Asn Leu Asn Phe Asd Leu Lys Glu He Asn Asp lie Cys Ser Gly 

115 ' 120 125 






Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Ar<? Ser__y.al_Ala Asp 
130 135 140- 

Thr Leu Arg Gin Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr 

145 150 155 160 



Arc Val Val Phe Thr Glv His Ser Leu Gly Gly Ala Leu Ala Thr Val 

20 ' 165 * 17C 175 

Ala Glv Ala Aso Leu Arc Glv Asn Gly Tyr Asp lie Asp Val Phe Ser 

. i8b ' :sf ir: 

25 Tvr Glv Ala Pro Arc Val Gly Asn Arc Ala Phe Ala Glu Pr.e Leu Thr 

* 195 " 200 205 



Val Gin Thr Glv Glv Thr Leu Tyr Arc 11-= Tr.r His Thr Asn Asp L'e 

210 ' * 215 220 

Val Pro Arc Leu Pro Pro Arc Glu Phe Gly Tyr Ser His Sar Ser Pro 

225 ' 230 * 225 240 

Glu Tvr Tro lie Lvs Ser Glv Thr Leu Val Pro Val Thr Arc Asn Asp 

35 " 245 ' 251 255 

lie Vai Lvs lie Glu Glv lie Asp Ala Tr.r Gly Gly Asn Asn Gin Pro 

260 * 265 210 

40 Asn He Pro Aso lie Pro Ala His Leu Trp Tyr Phe Gly Leu lie Gly 

275* 23 0 285 



Cvs 
230 



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 876 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Vector pJS037"

(vi) ORIGINAL SOURCE:

(B) STRAIN: Humicola lanuginosa

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..876



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
ATG AGG AGC TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC TTG 48

Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu
1 5 10 15

GCC AGT CCT ATA CGT AGA GAG GTC TCG CAG GAT CTG TTT AAC CAG TTC 96 
Ala Ser Pro lie Arg Arg Giu Val Ser Gin Asp Leu Phe Asn Gin Phe 
20 25 30 

AAT CTC TTT GCA CAG TAT TCA GCT GCC GCA TAC TGC GGA AAA AAC AAT 14 4 

Asn Leu Phe Ala Gin Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn 
35 40 45 

GAT GCC CCA GCA GGT ACA AAC ATT ACG TGC ACG GGA AAT GCA TGC CCC 192 
Asd Ala Pro Ala Gly Thr Asn lie Thr Cys Thr Gly Asn -Al*_Cys Pro 
50 55 60 

GAG GTA GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT 24 0 

Glu Val Giu Lvs Ala Aso Ala Thr Phe Leu Tyr Ser Phe Glu Aso Ser 
65 " 70 . " 75 80 

GGA GTG GGC GAT GTC ACC GGC TTC CTT GCT CTC GAC AAC ACG AAC AAG 238 
Giv Val Giv Asr Val Thr Glv Phe Leu Aia leu Asp Asn Thr Asn Lvs 
8 5 - 90 95 

CTT .ATC GTC CTC TCT TTC CCT GGC TCA AG.-. . C. A.« GAG AAC TGG ATC 3 35 

Leu lie Vai Leu Ser Phe Arc Glv Ser Arc Ser lie Giu Asr. Tro Tie 

ice ics 11: 

GGG A_AT CTT AAC CTC GAC TTC .AAA G.AA ATA. .--AT GAC ATT TGC TCC CC C 334 
Civ Asn le- Asr. ? r. e Asr le- lvs Glu lie Asr. Aso lie Cvs Ser Giv 
1 15 * 12C 125 

TGC AGG GGA. CAT GAC GGC TTC .ACT TCC TCC . GG AGG TCT GTA' GCC GAT 432 
Cvs Arc G i v H is Asr Glv ?r. e T - r Ser Ser Tro Ar g Ser Val Ai a As o 
130 125 140 

ACG TTA AG Z CAG AAG GTG GAC GAT GCT GTT CGC GAG CAT CCC GAC TAT 4 30 

Thr Leu Arg Gir. Lys Vai Giu Asp Aia Val Arc Glu His Pro Asp Tyr 
145 " 150 155 160 

CGC CTG GTG TTT ACC GGC CAT A.CC CTT GGT GGT GCG CTA GCA ACT GTT 523 
Arc Vai Val Phe Thr Giv His Ser Leu Giv Civ A.ia Leu A.ia Thr Vai 
165 """ " 170 * 175 

GCC G C.A GCA GAC CTG CGT GGA. AA.T GCG TAT vj.-.T ATC GrvC GTG TTT TCA 576 
Aia Giv Ala Aso leu Arg G 1 v As r. Glv Tyr Asp lie Asp Vai Phe Ser 
180 IS 5 " " ISC 

TAT GGC GCC CCC CCA GTC GGT AAC CGT GCT TTT GCA GAA TTC CTG ACC 62 4 

Tvr Giv Aia Pro Arg Val Giv Asr. Arc Ala Phe Ala Glu Phe Leu Thr 
195 ' 200 205 

GTA CAG ACC GGC GGT ACC CTC TAC CGC ATT A.CC CAC ACC AAT GAT ATT 672 
Vai Gin Thr Gly Giv Thr Leu Tvr Arc lie Thr His Thr Asn Aso lie 
210 215 220 

GTC CCT AGA CTC CCG CCT CGA GAA TTC GCT TAC AGC CAT TCT AGC CCA 7 20 

Vai Pro Arc Leu Pro Pro Arg Giu Phe Giv Tyr Ser His Ser Ser Pro 
225 ^ " 230 235 240 

GAG TAC TGG ATC AAA TCT GGA ACA CTA GTC CCC GTC ACC CGA AAC GAT 7 63 

Glu Tvr Tro He Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp 
245 250 255 

ATC GTG AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAG CCT 816
Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
260 265 270
AAC ATT CCG GAT ATC CCT GCG CAC CTA TGG TAC TTC GGG TTA ATT GGG 864
Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
275 280 285



ACA TGT CTT TAG 
Thr Cys Leu 
290 
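
As a consistency check on the listing above, the nucleotide rows of SEQ ID NO: 12 can be translated with the standard genetic code and compared with the amino acid rows printed beneath them. The sketch below is illustrative only: the helper `translate` and the compact codon-table construction are assumptions introduced here, and only the first and last rows of the listing are used as input.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


first_row = "ATG AGG AGC TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC TTG"
last_row = "ACA TGT CTT TAG"

print(translate(first_row))  # MRSSLVLFFVSAWTAL
print(translate(last_row))   # TCL*  (the asterisk marks the stop codon)
```

The first row gives Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu and the last row gives Thr Cys Leu followed by a stop codon, in agreement with the amino acid rows of the listing.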

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 292 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Arc Ser Ser Leu Vai Leu Phe Phe Vai Ser Ala Tro Thr A 1 a L^" 
1 5 10 * is 

Ala Ser Pro lie Arc Arc GIj Vai Ser Gin A .~ z leu Phe Asp, Gir. =>~o 
2: 25 30 

Asr. Leu Phe Ala Gir. Tvr Ser Ala Ala Ala Tvr Cvs Glv Lvs Asr As- 

Asp Ala Pre Ala Gly Tr.r Asr. lie Thr Gvs Tr.r Glv Asn Ala Cvs P-c 
SO =5 60 

Giu Vai Glu Lys Ala Asr; Ala Thr Pr.e Leu Tvr Ser Phe Giu" Asd Se~ 
65 - * 50 

Gly Vai Gly Asp Vai Thr Glv Phe Leu Ala Leu Aso Asn Thr Asn Lvs 
35 ■* 90 * 95 ' 

Leu lie Vai Leu Ser Pr.e Arz Gly Ser Arc, Ser lie Giu Asn Trp He 
-30 105 iiO 

Giy Asn Leu Asn Phe Asc leu lvs Glu lie Asn As 3 He Cvs S°r Glv 
1:5 120 * - 125 

Cys Arc Giy His Asp Gly Phe Thr Ser Ser Tro Arc Ser Vai A-ia Aso 
130 13= 140 

Thr Leu Arc Gin Lvs Vai Giu Aso Ala Vai Arc Giu His Pro Asd Ty- 
143 150 155 * 160 

Arc Vai Vai Phe Thr Giy His Ser Leu Giy Giy Ala Leu Ala Thr Vai 
165 170 175 

Ala Gly Ala Aso Leu Arc Glv Asn Giv Tvr Aso lie Aso Vai Phe Se- 
130 185 " 190 

Tyr Gly Ala Pro Arc Vai Giv Asn Arc Ala Phe Ala Glu Phe Leu Thr 
195 200 205 

Vai Gin Thr Gly Giy Thr Leu Tyr Arg lie Thr His Thr Asn Asp He 
210 215 220 

Vai Pro Arg Leu Pro Pro Arg Glu Phe Giy Tyr Ser His Ser Ser Pro 
225 230 235 240 

Glu Tyr Trp He Lys Ser Gly Thr Leu Vai Pro Vai Thr Arg Asn Asp 
245 250 255 



Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
260 265 270

Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
275 280 285



10 



Thr Cys Leu 
290 



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 864 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: Pseudomonas sp.

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..864

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 1..864

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTC GGC TCC TC3 AAC TAG ACT AAG ACC CAG TAT CCG ATC GTC CTG ACC 4 3 

Pr.e Glv Ser Ser Asn Tvr Thr Lvs Thr Gir. Tyr Pro lie Val Leu Thr 

1 * 5 * " 10 15 

4 0 CAC GGC AT 3 CTT GGT TTC GAC AGC CTG CT" GGA GTC GAC TAG TGG TAG 96 

His Glv Met Leu Giv ?r,e Aso Ser Leu Leu Giy Val Asp Tyr Trp Tyr 

2Z 25 30 

GGC ATT OCT TTA GCC CTG CGT .AAA GAC GGC GGC ACC GTC TAC GTC ACC 14 4 

4 5 Glv lie Pre Ser Ala Leu Arg Lys Asp Giy Aia Thr Val Tyr Val Thr 

35 40 45 

GAA GTC AGC CAG CTC GAC ACC TCC GAA GCC CGA GGT GAG CAA CTG CTG 192 

Glu Val Ser Gin Leu Aso Thr Ser Glu Ala Arc Giy Giu Gin Leu Leu 

50 50 ' 55 60 



55 



ACC CAA GTC GAG GAA ATC GTG GCC ATC AGC GGC AAG CCC AAG GTC AAC 2 40 

Thr Gin Val Glu Giu lie Val Ala lie Ser Giy Lys Pro Lys Val Asn 
65 70 75 80 



CTG TTC GGC CAC AGC CAT GGC GGG CCT ACC ATC CGC TAC GTT GCC GCC 288 
Leu Phe Giv His Ser His Giy Giy Pro Thr lie Arg Tyr Val Ala Ala 
85 90 95 



60 GTG CGC CCG GAT CTG GTC GCC TCG GTC ACC AGC ATT GGC GCG CCG CAC 336 
Va 1 Arg Pro Aso Leu Val Aia Ser Val Thr Ser lie Gly Ala Pro His 
100 105 110 

AAG GGT TCG GCC ACC GCC GAC TTC ATC CGC CAG GTG CCG GAA GGA TCG 384

Lys Gly Ser Ala Thr Ala Asp Phe Ile Arg Gln Val Pro Glu Gly Ser
115 120 125
GCC AGC GAA GCG ATT CTG GCC GGG ATC GTC AAT GGT CTG GGT GCG CTG 432
Ala Ser Glu Ala Ile Leu Ala Gly Ile Val Asn Gly Leu Gly Ala Leu
130 135 140



5 ATC AAC TTC CTT TCC GGC AGC AGT TCG GAC ACC CCA CAG AAC TCG CTG 
He Asn Phe Leu Ser Gly Ser Ser Ser Asp Thr Pro Gin Asn Ser Leu 
145 150 155 160 



20 



40 



50 



GTG GTC AAT GGC GTG CGC TAT TAC TCC TGG AGG GGC ACC AGC CCG CTG 
Val Val Asn Gly Val Arc Tyr Tyr Ser Tro.Arg Gly Thr Ser D -o Lo- 
195 200 • 205 



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:



Phe Gly Ser Ser Asn Tyr Thr Lys Thr Gin Tvr Pro He Val Leu Th* 
1 5 10 15 



His Gly Met Leu Gly Phe Aso Ser Leu Leu Gly Val Aso Tv- T-p Ty- 

55 20 1 25 * 30 

Gly He Pro Ser Ala Leu Arg Lvs Aso Gly Ala Thr Val Tyr Val Thr 

35 40 * 45 

60 Glu Val Ser Gin Leu Asd Thr Ser Glu Ala Arc Gly Glu Gin Leu Leu 

50 55 * 60 

Thr Gin Val Glu Glu He Val Ala He Ser Gly Lys Pro Lys Val Asn 

65 70 75 80 

Leu Phe Gly His Ser His Gly Gly Pro Thr lie Arg Tyr Val Ala Ala 

85 90 95 



480 



GGC ACG CTG GAG TCA CTG AAC TCC GAA GGC GCC GCA CGG TTT AAC GCC 528 
10 Gly Thr Leu Glu Ser Leu Asn Ser Glu Gly Ala Ala Arg Phe Asn Ala 

165 170 175 

CGC TTC CCC CAG GGG GTA CCA ACC AGC GCC TGC GGC GAG _G.Gv GAT TAC 57 6 

Arg Phe Pro Gin Gly Val Pro Thr Ser Ala Cys Gly Glu GT\TAsp Ty- 
15 180 185 190 



624 



ACC AAC GTA CTC GAC CCC TCC GAC CTG CTG CTC GGC GCC ACC TCC CTC 67 2 

Thr Asn Val Leu Aso Pro Ser Aso Leu Leu leu Giv Ala Tr - ' 
210 . 215 220 

2 5 ACC TTC GGT TTC GAG GCC AAC GAT GGT CTG CTC GGA CGC TCC AGC TC~ 720 
Thr Pr.e Gly Phe Glu Ala Asn Aso Giv Leu Vai Giv Arc Cvs Ser 5«- 
225 230 155 ' 24C 

CGG CTG GGT ATG GTG ATC CGC GAC AAC TAC CCC* ATG AAC CAC CTG GAC 753 
JU Arc Leu Gly Me: Val He Arc Aso Asn Tvr Arg Met Asn His Leu As- 

215 25C 255 

GAG GCG ^ C C '"' G ACC TTC GG: * ctg • 1, - c *~* gg a:g ?tg GAG. ACC AGC CCG 816 
Glu V a i Asn Gin Thr Phe Giv Leu Tnr Ser lie Phe Glu Thr Ser Pro 
35 260 * 255 270 

GTA TCG GTC TAT CGC CAG CAA GCC AAT CGC CTC AAG AAC GCC GGG CTC 864
Val Ser Val Tyr Arg Gln Gln Ala Asn Arg Leu Lys Asn Ala Gly Leu
275 280 285



Val Arg Pro Asp Leu Val Ala Ser Val Thr Ser lie Gly Ala Pro His 

100 105 110 

Lys Gly Ser Ala Thr Ala Aso Phe lie Arg Gin Val Pro Glu Gly Ser 

115 * 120 125 

Ala Ser Glu Ala He Leu Ala Giv He Vai Asn Gly Leu Gly Ala Leu 

130 135 140 

He Asn Phe Leu Ser Gly Ser Ser Ser Asp Thr Pro Gin Asn Ser Leu 

145 150 155 160 



Gly Thr Leu Glu Ser Leu Asn Ser Glu Gly Ala Ala Arg PtT^'-Asn Ala 
15 165 170 175 

Arg Phe Pro Gin Gly Val Pro Thr Ser Ala Cys Gly Glu Gly Asp Tyr 
180 135 ; 190 

20 Vai Vai Asn Giv Vai Arg Tvr Tvr Ser Tro Arc Giv Thr Ser Pro Leu 
195 * " 200 * 205 



Thr Asn Val Leu Aso Pro Ser Aso Leu leu Lej Giv Ala Thr Ser leu 

210 ' 215 220 

Thr Phe Gly Phe Glu Ala Asr. Aso Giv 1-2- Val Gly Arg Cys Ser Ser 

225 220 23: 240 



Arc leu G i v Met V a 1 lie A r z Aso As r. T v r A r r X ez As n His Leu Aso 

30 '245 " 251 * 255 * 

Giu Vai Asn Gin Thr ?r.e Giv leu Thr Ser He ?r.e Glu Thr Ser Pro 

260 * 2ii 210' 

3 5 Vai Ser Vai Tvr Arc Gin Gin Ala Asn Arc Lej Lvs Asn Ala Giv Leu 

275 280 285

PATENT CLAIMS

1. A method for preparing polypeptide variants by shuffling
different nucleotide sequences of homologous DNA sequences by in
vivo recombination comprising the steps of

a) forming at least one circular plasmid comprising a DNA sequence
encoding a polypeptide,

b) opening said circular plasmid(s) within the DNA sequence(s)
encoding the polypeptide(s),

c) preparing at least one DNA fragment comprising a DNA sequence
homologous to at least a part of the polypeptide coding region on at
least one of the circular plasmid(s),

d) introducing at least one
of said opened plasmid(s), together with at least one of said
homologous DNA fragment(s) covering full-length DNA sequences
encoding said polypeptide(s) or parts thereof, into a recombination
host cell,

e) cultivating said recombination host cell, and

f) screening for positive polypeptide variants.

2. The method according to claim 1, wherein more than one cycle of
step a) to f) is performed.

and 2, wherein two or more
one or more homologous DNA 



4. The method according to any of claims 1 to 3, wherein the opened
plasmid(s) is (are) gapped.

5. The method according to any of claims 1 to 4, wherein the ratio
between the opened plasmid(s) and homologous DNA fragment(s) is in
the range from 20:1 to 1:50, preferably from 2:1 to 1:10 (mol
vector:mol fragments) with the specific concentrations being from 1
pM to 10 M of the DNA.

6. The method according to any of claims 1 to 5, wherein 2 or more,
preferably from 2 to 6, especially 2 to 4 of the DNA fragments have
partially overlapping regions.

7. The method according to claim 6, wherein the overlapping regions
of the DNA fragments lie in the range from 5 to 5000 bp, preferably
from 10 bp to 500 bp, especially 10 bp to 100 bp.

8. The method according to any of claims 1 and 8, wherein at least
one cycle of step a) to f) is backcrossing with the initially used
DNA fragments.

9. The method according to any of claims 1 and 8, wherein the
plasmid(s) is (are) opened in the region around the middle of the DNA
sequence(s) encoding the polypeptide(s).

10. The method according to any of claims 1 to 9, wherein the
plasmid(s) is (are) opened close to a mutation in the DNA sequence(s)
encoding the polypeptide(s).

11. The method according to any of claims 1 to 10, wherein the DNA
fragment(s) prepared in step c) is (are) prepared under conditions
suitable for high, medium or low mutagenesis.

12. The method according to any of claims 1 to 11, wherein the
polypeptides producible from the input DNA sequences are enzymes or
proteins with biological activity.

13. The method according to claim 12, wherein the polypeptides are
enzymes selected from the group including proteases, lipases,
cutinases, cellulases, amylases, peroxidases, oxidases and phytases.

14. The method according to claim 12, wherein the polypeptides are
proteins with biological activity selected from the group including
insulin, ACTH, glucagon, somatostatin, somatotropin, thymosin,
parathyroid hormone, pigmentary hormones, somatomedin,
erythropoietin, luteinizing hormone, chorionic gonadotropin, hypothalamic
releasing factors, antidiuretic hormones, thyroid stimulating
hormone, relaxin, interferon, thrombopoietin (TPO) and prolactin.

15. The method according to any of claims 1 to 13, wherein at least
one of the initially used input DNA sequences is a wild-type DNA
sequence, such as a DNA sequence coding for wild-type enzymes, in
particular lipases, derived from filamentous fungi, such as Humicola
sp., in particular Humicola lanuginosa, especially Humicola
lanuginosa DSM 4109.
16. The method according to claim 15, wherein at least one of the 
input DNA sequences is selected from the group of vectors (a) to (f) 
and/or DNA fragments (g) to (aa) coding for Humicola lanuginosa 
lipase variants. 



17. The method according to any of claims 1 to 13, wherein at least 
one of the initially used input DNA sequences is a wild-type DNA 
sequence, such as a DNA sequence coding for wild-type enzymes, in 
particular lipases, derived from filamentous fungi of the genera 
Absidia, Rhizopus, Emericella, Aspergillus, Penicillium, 
Eupenicillium, Paecilomyces, Talaromyces, Thermoascus and 
Sclerocleista. 



18. The method according to any of claims 1 to 13, wherein at least 
one of the initially used input DNA sequences is a wild-type DNA 
sequence, such as a DNA sequence coding for wild-type enzymes, in 
particular lipases, derived from bacteria, such as Pseudomonas sp., 
in particular Ps. fragi, Ps. stutzeri, Ps. cepacia, Ps. fluorescens, 
Ps. plantarii, Ps. gladioli, Ps. alcaligenes, Ps. pseudoalcaligenes, 
Ps. mendocina, Ps. aeruginosa, Ps. glumae, Ps. syringae, Ps. 
wisconsinensis, or a strain of Bacillus sp., in particular B. 
subtilis, B. stearothermophilus or B. pumilus, or a strain of 
Streptomyces sp., in particular S. scabies, or a strain of 
Chromobacterium sp., in particular C. viscosum. 



19. The method according to any of claims 1 to 13, wherein at least 
one of the initially used input DNA sequences is a variant DNA 
sequence, such as a DNA sequence coding for a variant enzyme, in 
particular lipase variants, derived from yeasts, such as Candida 
sp., in particular Candida rugosa, or Geotrichum sp., in particular 
Geotrichum candidum. 

20. The method according to any of claims 1 to 19, wherein the 
homologous input DNA sequences are at least 60%, preferably at least 
70%, better more than 80%, especially more than 90%, and even up to 
100% homologous. 



21. The method according to any of claims 1 to 20, wherein the 
recombination host cell is a eukaryotic cell, such as a fungal cell 
or a plant cell. 

22. The method according to claim 21, wherein said fungal cell is a 
yeast cell from the group of cells of Saccharomyces sp., in 
particular strains of Saccharomyces cerevisiae or Saccharomyces 
kluyveri, or Schizosaccharomyces sp., in particular 
Schizosaccharomyces pombe, or Kluyveromyces sp., such as K. lactis, 
or Hansenula sp., in particular H. polymorpha, or Pichia sp., in 
particular P. pastoris, or a filamentous fungus from the group of 
Aspergillus sp., in particular A. niger, A. nidulans or A. oryzae, 
or Neurospora sp., or Fusarium sp., in particular F. oxysporum, or 
Trichoderma sp. 

23. The method according to any of claims 1 to 22, wherein the 
plasmid DNA sequence(s) coding for the polypeptide(s) is (are) 
operably linked to a replication sequence. 

24. The method according to claim 22, wherein the plasmid DNA 
sequence(s) encoding the polypeptide(s) is (are) operably linked to a 
functional promoter sequence. 

25. The method according to claim 24, wherein the plasmid is an 
expression plasmid. 

26. The method according to claim 25, wherein the expression 
plasmid is pJSO26 or pJSO37. 
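
Claim 20 expresses the relatedness of the input DNA sequences as a percentage. The claim itself does not state how that percentage is computed, so the following is only a minimal sketch, assuming that homology is approximated by the fraction of identical positions between two sequences that have already been aligned to equal length. The function names and the toy sequences are illustrative; only the 60% figure is taken from the claim.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of identical positions in two pre-aligned sequences.

    Assumption: both sequences have the same length and are already aligned;
    no gaps are introduced or scored here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, minimum: float = 60.0) -> bool:
    """Check a pair against a minimum identity (60% is the lowest value in claim 20)."""
    return percent_identity(seq_a, seq_b) >= minimum


# Toy sequences, not taken from the patent: three of four positions agree.
print(percent_identity("ACGT", "ACGA"))
print(meets_threshold("ACGT", "ACGA"))
```

A real comparison of two genes would first need an alignment step, which this sketch deliberately leaves out.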

Title: Method for preparing polypeptide variants 
ATGAGGAGCTCCCTTGTGCTGTTCTTTGTCTCTGCGTGGACGGCCTTGGCCAGTCCTATT 
5447 ---------+---------+---------+---------+---------+---------+ 5506 

1 MRSSLVLFFVSAWTALASPI 

CGTCGAGAGGTCTCGCAGGATCTGTTTAACCAGTTCAATCTCTTTGCACAGTATTCTGCA 
5507 ---------+---------+---------+---------+---------+---------+ 5566 

21 RREVSQDLFNQFNLFAQYSA 

GCCGCATACTGCGGAAAAAACAATGATGCCCCAGCTGGTACAAACATTACGTGCACGGGA 
5567 ---------+---------+---------+---------+---------+---------+ 5626 

41 AAYCGKNNDAPAGTNITCTG 

AATGCCTGCCCCGAGGTAGAGAAGGCGGATGCAACGTTTCTCTACTCGTTTGAAGACTCT 
5627 ---------+---------+---------+---------+---------+---------+ 5686 

61 NACPEVEKADATFLYSFEDS 

GGAGTGGGCGATGTCACCGGCTTCCTTGCTCTCGACAACACGAACAAATTGATCGTCCTC 
5687 ---------+---------+---------+---------+---------+---------+ 5746 

81 GVGDVTGFLALDNTNKLIVL 

[Rows 5747 to 6166 (residues 101 to 240) and the first four bases of row 6167 are illegible in the source; N marks an unread base.] 

NNNNACTGGATCAAATCTGGAACCCTTGTCCCCGTCACCCGAAACGATATCGTGAAGATA 
6167 ---------+---------+---------+---------+---------+---------+ 6226 

241 EYWIKSGTLVPVTRNDIVKI 

GAAGGCATCGATGCCACCGGCGGCAATAACCAGCCTAACATTCCGGATATCCCTGCGCAC 
6227 ---------+---------+---------+---------+---------+---------+ 6286 

261 EGIDATGGNNQPNIPDIPAH 

CTATGGTACTTCGGGTTAATTGGGACATGTCTTTAG 
6287 ---------+---------+---------+------ 6322 

281 LWYFGLIGTCL* 
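
Each block of the listing above pairs a 60-base DNA row with the 20 residues it encodes, so a row can be checked by translating it codon by codon. The following is a minimal sketch, assuming the standard genetic code and reading frame 1 starting at the first base of each row; the row strings are copied from the listing as best they can be read, and the names `CODON_TABLE`, `translate`, `FIRST_ROW` and `LAST_ROW` are illustrative only.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons (e.g. with N) give 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First row (positions 5447 to 5506) and last row (6287 to 6322) of the listing.
FIRST_ROW = "ATGAGGAGCTCCCTTGTGCTGTTCTTTGTCTCTGCGTGGACGGCCTTGGCCAGTCCTATT"
LAST_ROW = "CTATGGTACTTCGGGTTAATTGGGACATGTCTTTAG"

print(translate(FIRST_ROW))  # compare with the residue line numbered 1
print(translate(LAST_ROW))   # compare with the final residue line; '*' is the stop codon
```

A mismatch between a translated row and its residue line points to a misread base rather than to a property of the sequence itself.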
ATGAGGAGCTCCCTTGTGCTGTTCTTTGTCTCTGCGTGGACGGCCTTGGCCAGTCCTATA 
5447 ---------+---------+---------+---------+---------+---------+ 5506 

1 MRSSLVLFFVSAWTALASPI 
SnaBI PvuII 

CGTAGAGAGGTCTCGCAGGATCTGTTTAACCAGTTCAATCTCTTTGCACAGTATTCAGCT 
5507 ---------+---------+---------+---------+---------+---------+ 5566 

21 RREVSQDLFNQFNLFAQYSA 

GCCGCATACTGCGGAAAAAACAATGATGCCCCAGCAGGTACAAACATTACGTGCACGGGA 
5567 ---------+---------+---------+---------+---------+---------+ 5626 

41 AAYCGKNNDAPAGTNITCTG 
SphI 

AATGCATGCCCCGAGGTAGAGAAGGCGGATGCAACGTTTCTCTACTCGTTTGAAGACTCT 
5627 ---------+---------+---------+---------+---------+---------+ 5686 

61 NACPEVEKADATFLYSFEDS 

HindIII 

GGAGTGGGCGATGTCACCGGCTTCCTTGCTCTCGACAACACGAACAAGCTTATCGTCCTC 
5687 ---------+---------+---------+---------+---------+---------+ 5746 

81 GVGDVTGFLALDNTNKLIVL 

[Rows 5747 to 6166 (residues 101 to 240) are illegible in the source; the restriction-site labels that survive in this region are BglII, BstXI, "Nh" (truncated), BstEII, KpnI and XhoI.] 

SpeI 

GAGTACTGGATCAAATCTGGAACACTAGTCCCCGTCACCCGAAACGATATCGTGAAGATA 
6167 ---------+---------+---------+---------+---------+---------+ 6226 

241 EYWIKSGTLVPVTRNDIVKI 

GAAGGCATCGATGCCACCGGCGGCAATAACCAGCCTAACATTCCGGATATCCCTGCGCAC 
6227 ---------+---------+---------+---------+---------+---------+ 6286 

261 EGIDATGGNNQPNIPDIPAH 

CTATGGTACTTCGGGTTAATTGGGACATGTCTTTAG 
6287 ---------+---------+---------+------ 6322 

281 LWYFGLIGTCL* 